Build Better LLM Apps with Assertion-Based Unit Tests

Dave Ebbelaar • November 27, 2024

@daveebbelaar

About

Hi, I'm Dave, an AI engineer with over a decade of experience in artificial intelligence. At Datalumina, I lead the AI projects, where we're always pushing the boundaries of what's possible while still building reliable applications. I'm not going to impress you with theory, but teach you (without hype) how to build real AI systems, using lasting engineering principles.

Along the way:
- 10M+ views across YouTube, LinkedIn, and courses
- Helped 1M+ developers get started with AI
- Building and scaling a 7-figure AI company
- Delivered 50+ custom B2B AI solutions
- Consulted for TimescaleDB, ClickUp, and n8n
- Helped 500+ developers launch freelance careers
- BSc + MSc in Artificial Intelligence (VU Amsterdam)

I post about AI engineering, building real systems, and what it takes to go independent as a developer. If you want to build AI that actually works, or build a freelance career around it, follow along 👊🏻

Video Description

Want to start freelancing? Let me help: https://academy.datalumina.com/freelance
Want to learn real AI Engineering? Go here: https://academy.datalumina.com/accelerator

💼 Need help with a project? Work with me: https://www.datalumina.com/

🔗 Article
https://applied-llms.org/

🛠️ My Development Workflow
https://youtu.be/3sIzCFuLgIQ

⏱️ Timestamps
0:10 Introduction to LLM Evaluation Techniques
2:46 Understanding Data Processing Steps
3:59 Writing Assertions for LLM Outputs
6:39 Structuring Your Evaluation Logic

📌 Description
In this video, I discuss practical evaluation techniques for enhancing the reliability of large language model (LLM) applications. I introduce assertion-based unit tests and methods to capture real-world input data, enabling effective analysis of customer interactions. I highlight the importance of structured outputs with the Instructor library and demonstrate how multiple assertions can validate system responses. Additionally, I discuss organizing code for better maintenance and recommend the observability platform Langfuse for tracking API calls. Finally, I share insights on a boilerplate project for event-driven LLM applications and tips for developers transitioning into freelancing.
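To make the idea of running multiple assertions against a structured LLM output concrete, here is a minimal sketch in Python. It is not the code from the video: the `TicketReply` fields, the `ALLOWED_CATEGORIES` set, and the `check_reply` helper are all hypothetical names chosen for illustration, and a plain dataclass stands in for the model you would normally get back from the Instructor library.

```python
from dataclasses import dataclass


@dataclass
class TicketReply:
    """Hypothetical structured output of an LLM call (stand-in for an Instructor response model)."""
    category: str
    confidence: float
    reply: str


# Assumed label set for this sketch; a real project would define its own.
ALLOWED_CATEGORIES = {"billing", "technical", "general"}


def check_reply(output: TicketReply) -> list[str]:
    """Run several independent checks and return the names of the ones that failed."""
    checks = {
        "category_is_known": output.category in ALLOWED_CATEGORIES,
        "confidence_in_range": 0.0 <= output.confidence <= 1.0,
        "reply_not_empty": bool(output.reply.strip()),
        "reply_has_no_placeholder": "[" not in output.reply,
    }
    return [name for name, passed in checks.items() if not passed]


def test_reply_passes_all_checks() -> None:
    """A pytest-style unit test: one captured output, many assertions."""
    # In practice this object would come from a captured real-world input run through the model.
    sample = TicketReply(
        category="billing",
        confidence=0.9,
        reply="Your invoice has been resent.",
    )
    failures = check_reply(sample)
    assert not failures, f"failed checks: {failures}"
```

Collecting the failed check names before asserting means a single test run reports every problem with an output at once, rather than stopping at the first failing assertion.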