How to Get Your Data Ready for AI Agents (Docs, PDFs, Websites)
Dave Ebbelaar
@daveebbelaar
About
Hi, I'm Dave, an AI engineer with over a decade of experience in artificial intelligence. At Datalumina, I lead the AI projects, where we're always pushing the boundaries of what's possible while still building reliable applications. I'm not going to impress you with theory, but teach you (without hype) how to build real AI systems, using lasting engineering principles.

Along the way:
- 10M+ views across YouTube, LinkedIn, and courses
- Helped 1M+ developers get started with AI
- Building and scaling a 7-figure AI company
- Delivered 50+ custom B2B AI solutions
- Consulted for TimescaleDB, ClickUp, and n8n
- Helped 500+ developers launch freelance careers
- BSc + MSc in Artificial Intelligence (VU Amsterdam)

I post about AI engineering, building real systems, and what it takes to go independent as a developer. If you want to build AI that actually works, or build a freelance career around it, follow along 👊🏻
Video Description
Want to start freelancing? Let me help: https://academy.datalumina.com/freelance
Want to learn real AI Engineering? Go here: https://academy.datalumina.com/accelerator

💼 Need help with a project? Work with me: https://www.datalumina.com/

🔗 GitHub Repository
https://github.com/daveebbelaar/ai-cookbook/tree/main/knowledge/docling

🛠️ My VS Code / Cursor Setup
https://youtu.be/mpk4Q5feWaw

⏱️ Timestamps
0:45 Building an Extraction Pipeline
2:15 Document Conversion Basics
6:12 HTML Extraction Techniques
9:10 Chunking Data for AI
14:22 Storing in Vector Databases
19:51 Searching the Vector Database
22:16 Creating an Interactive Application

📌 Description
In this Docling tutorial, you will learn to extract and structure data from various documents, utilizing techniques such as parsing, chunking, and embedding. A walkthrough of Docling and a practical demonstration illustrate these processes. The video also explores integrating vector databases for efficient data storage and enhancing AI responses through embedding models. Finally, a simple interactive chat application is demonstrated, showcasing the completed knowledge extraction pipeline and optimization strategies.

👋🏻 About Me
Hi! I'm Dave, AI Engineer and founder of Datalumina®. On this channel, I share practical tutorials that teach developers how to build production-ready AI systems that actually work in the real world. Beyond these tutorials, I also help people start successful freelancing careers. Check out the links above to learn more!
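
🧪 Pipeline sketch
The steps named in the description (chunking, embedding, storing in a vector database, searching) can be illustrated with a minimal, standard-library-only sketch. This is not the code from the video or the linked repository. It assumes the document has already been converted to plain text (the step Docling handles in the video), and every name in it (`chunk_text`, `embed`, `InMemoryVectorStore`) is hypothetical. The bag-of-words "embedding" is a toy stand-in for a real embedding model, and the in-memory list is a stand-in for a real vector database.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Group paragraphs into chunks of at most max_words words each."""
    chunks: list[str] = []
    current: list[str] = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        words = paragraph.split()
        if not words:
            continue
        # Start a new chunk if this paragraph would overflow the current one.
        if current and len(current) + len(words) > max_words:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
        # Hard-split a single paragraph that is longer than max_words.
        while len(current) > max_words:
            chunks.append(" ".join(current[:max_words]))
            current = current[max_words:]
    if current:
        chunks.append(" ".join(current))
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase term counts, not a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database: a list of (text, vector)."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, Counter]] = []

    def add(self, chunks: list[str]) -> None:
        for chunk in chunks:
            self.rows.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 3) -> list[str]:
        query_vector = embed(query)
        ranked = sorted(
            self.rows, key=lambda row: cosine(query_vector, row[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]


document = """Docling converts PDFs and HTML pages into structured text.

Chunking splits long documents into smaller passages.

Embeddings map each passage to a vector stored in a database."""

store = InMemoryVectorStore()
store.add(chunk_text(document, max_words=12))
print(store.search("How are long documents split?", k=1))
```

In a real pipeline, the retrieved chunks would be passed to a language model as context for answering the user's question, which is the role the chat application plays at the end of the video.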