Agent evaluation with ADK & Vertex AI | The Agent Factory Podcast
Video Description
Learn how to evaluate your AI agent effectively and ensure it performs reliably in production. This episode of The Agent Factory is your definitive guide on Agent Evaluation, showing you how to go from local testing with the Agent Development Kit (ADK) to large-scale, enterprise-grade evaluation using Vertex AI. We break down how to implement a full-stack agent evaluation strategy, including how to use ADK for fast debugging and golden dataset creation, and how Vertex AI's GenAI Evaluation service scales your testing with the LLM-as-a-judge approach. Don't launch an agent you can't trust. Watch to learn how to measure outcome, reasoning, tool use, and memory.

Want to build production-ready agents? Don't miss an episode!

In this episode you'll learn:
1️⃣ How to evaluate the agent's system-level behavior, not just its output.
2️⃣ The 5-step inner-loop workflow for testing agents with ADK (Agent Development Kit).
3️⃣ How to use Vertex AI for production-scale, qualitative agent evaluation.
4️⃣ The unique challenges of testing and evaluating multi-agent systems (A2A).
5️⃣ Techniques for generating synthetic data to solve the evaluation cold-start problem.

About The Agent Factory: "The Agent Factory" is a video-first technical podcast for developers, by developers, focused on building production-ready AI agents. We explore how to design, build, deploy, and manage agents that bring real value.
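As a rough illustration of what measuring tool use against a golden dataset can look like, here is a minimal sketch in plain Python. It does not use the ADK or Vertex AI APIs. The names `ToolCall`, `call`, `exact_match`, and `recall`, as well as the example tools `lookup_order` and `issue_refund`, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """One tool invocation: the tool name plus its arguments (hypothetical type)."""
    name: str
    args: tuple  # sorted (key, value) pairs, so keyword order does not affect equality


def call(name, **args):
    """Build a ToolCall with its arguments in a canonical order."""
    return ToolCall(name, tuple(sorted(args.items())))


def exact_match(expected, actual):
    """1.0 if the agent made exactly the golden tool calls, in order; else 0.0."""
    return 1.0 if list(expected) == list(actual) else 0.0


def recall(expected, actual):
    """Fraction of golden tool calls that appear anywhere in the agent's trajectory."""
    if not expected:
        return 1.0
    hits = sum(1 for step in expected if step in actual)
    return hits / len(expected)


# Golden trajectory recorded for one test case (assumed example data).
golden = [call("lookup_order", order_id="A1"), call("issue_refund", order_id="A1")]
# Trajectory the agent actually produced: it skipped the refund step.
observed = [call("lookup_order", order_id="A1")]

print(exact_match(golden, observed))  # 0.0
print(recall(golden, observed))       # 0.5
```

Scores like these only cover the tool-use dimension. Outcome and reasoning quality usually need a qualitative check, which is where the LLM-as-a-judge approach mentioned above comes in.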
🔗 Resources & links mentioned:
➖ Google's Agent Development Kit (ADK) evaluation guide → https://goo.gle/3KshHIu
➖ Google's Agent Development Kit (ADK) → https://goo.gle/3Kq6Lex
➖ Vertex AI GenAI Evaluation Service → https://goo.gle/3ICTMpe
➖ How to evaluate generated answers from RAG at scale on Vertex AI → https://goo.gle/4o1oh7p
➖ How to evaluate LLMs with custom criteria using Vertex AI AutoSxS → https://goo.gle/46GfMYg, https://goo.gle/3IOMjDt

Subscribe to The Agent Factory → https://www.youtube.com/playlist?list=PLIivdWyY5sqLXR1eSkiM5bE6pFlXC-OSs
🔔 Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech

#AgentEvaluation #EvaluateTheAgent #ADK #VertexAI #AIAgents #AI #Payments

Speakers: Annie Wang, Ivan Nardini
Products Mentioned: ADK, Vertex AI, A2A