How to Test and Evaluate AI Agents - an introduction

Name: How to Test and Evaluate AI Agents - an introduction
Rating: 4.4 (5 reviews)

An introductory course on how to test, measure, and improve AI agent behavior using modern evaluation tools

Created by Dan Andrei Bucureanu

Last updated 11/2025

English

What you'll learn

Understand the Fundamentals of AI Agent Testing
Design and Execute Systematic AI Agent Tests
Implement RAG (Retrieval-Augmented Generation) Evaluation
Understand Functional Testing of AI Agents
Understand Non-Functional Testing of AI Agents
Understand how to evaluate goal completion metrics
Understand how to evaluate task completion metrics
Understand how to evaluate plan creation metrics
Understand cost and efficiency evaluation
Compare Deterministic vs. Agentic vs. Autonomous Systems

Course content

7 sections • 43 lectures • 3h 54m total length

Introduction (3:05)
About your instructor (2:00)
AI tech stack (4:04)
Types of applications that use AI / LLMs (5:48)

Requirements

Willingness to learn
Curiosity
Basic AI know-how
Basic testing experience
Basic software know-how
No coding experience needed

Description

What You’ll Learn

Artificial intelligence agents are no longer static chatbots: they plan, reason, and act autonomously. This course teaches you how to systematically test, measure, and validate AI agent behavior using current evaluation tools and frameworks.

Through real-world Python examples and structured exercises, you’ll learn how to evaluate both functional and non-functional aspects of AI systems, from goal completion and plan accuracy to efficiency and bias detection.

By the end of this course, you’ll know how to design robust AI evaluation pipelines, implement RAG (Retrieval-Augmented Generation) tests, and confidently report metrics that reflect true agent performance.
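
As a rough illustration of what a repeatable evaluation pipeline can look like, here is a minimal Python sketch. It is not taken from the course material: the AgentTestCase class, the stub_agent function, and the keyword-based pass criterion are all hypothetical stand-ins, and a real pipeline would call an actual agent and would likely use richer checks.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AgentTestCase:
    """One structured test case (hypothetical shape, for illustration only)."""
    name: str
    prompt: str
    expected_keywords: list[str]  # assumed pass criterion: every keyword appears


def stub_agent(prompt: str) -> str:
    """Hypothetical stand-in for a real agent call; returns canned replies."""
    canned = {
        "refund": "Your refund was issued to the original payment method.",
        "hours": "We are open Monday to Friday.",
    }
    for key, reply in canned.items():
        if key in prompt.lower():
            return reply
    return "Sorry, I cannot help with that."


def run_suite(agent: Callable[[str], str], cases: list[AgentTestCase]) -> dict:
    """Run every case against the agent and aggregate a pass rate."""
    results = {}
    for case in cases:
        output = agent(case.prompt).lower()
        results[case.name] = all(kw.lower() in output for kw in case.expected_keywords)
    passed = sum(results.values())
    return {"results": results, "pass_rate": passed / len(cases)}


cases = [
    AgentTestCase("refund_flow", "I want a refund for my order", ["refund", "issued"]),
    AgentTestCase("opening_hours", "What are your hours?", ["monday", "friday"]),
    AgentTestCase("out_of_scope", "Book me a flight", ["flight", "booked"]),
]
report = run_suite(stub_agent, cases)
print(report)
```

Because the stub is deterministic, the same suite gives the same report on every run. With a real, non-deterministic agent, the same structure can be run several times per case and the pass rates averaged.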

Course Modules

Understand the Fundamentals of AI Agent Testing
Learn what makes AI agents unique, from autonomy and planning to tool use and decision-making.
Design and Execute Systematic AI Agent Tests
Build a repeatable test strategy using structured test cases, reproducible results, and automated evaluation scripts.
Implement RAG (Retrieval-Augmented Generation) Evaluation
Evaluate how effectively an agent retrieves and integrates external knowledge sources.
Understand Functional Testing of AI Agents
Test accuracy, correctness, and behavior alignment with expected outcomes.
Understand Non-Functional Testing of AI Agents
Measure efficiency, robustness, reliability, and responsiveness in complex or dynamic environments.
Evaluate Key Agent Metrics
- Goal Completion
- Task Execution
- Plan Creation
- Cost and Efficiency
Compare Deterministic vs. Agentic vs. Autonomous Systems
Understand the testing implications across AI system maturity levels.
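
The agent metrics named above can be illustrated with a small Python sketch. This is one possible set of definitions under stated assumptions, not the course's own formulas: the AgentRun record, its field names, and the plan_overlap measure are hypothetical, and the two sample runs are made-up illustration data.

```python
from dataclasses import dataclass


@dataclass
class AgentRun:
    """Recorded outcome of one agent run (hypothetical fields)."""
    goal_achieved: bool
    tasks_planned: int
    tasks_completed: int
    plan_steps: list[str]
    reference_steps: list[str]
    total_tokens: int
    cost_usd: float


def plan_overlap(plan: list[str], reference: list[str]) -> float:
    """Fraction of reference steps that also appear in the produced plan."""
    if not reference:
        return 1.0
    return len(set(plan) & set(reference)) / len(set(reference))


def summarize(runs: list[AgentRun]) -> dict:
    """Aggregate goal, task, plan, and cost metrics over a batch of runs."""
    n = len(runs)
    successes = sum(r.goal_achieved for r in runs)
    total_cost = sum(r.cost_usd for r in runs)
    return {
        "goal_completion_rate": successes / n,
        "task_completion_rate": sum(r.tasks_completed for r in runs)
        / sum(r.tasks_planned for r in runs),
        "mean_plan_overlap": sum(
            plan_overlap(r.plan_steps, r.reference_steps) for r in runs
        ) / n,
        "cost_per_success_usd": total_cost / successes if successes else float("inf"),
        "mean_tokens": sum(r.total_tokens for r in runs) / n,
    }


reference = ["search", "read", "summarize", "reply"]
runs = [
    AgentRun(True, 4, 4, ["search", "read", "summarize", "reply"], reference, 1200, 0.03),
    AgentRun(False, 4, 2, ["search", "reply"], reference, 800, 0.02),
]
summary = summarize(runs)
print(summary)
```

Dividing total cost by the number of successful runs, rather than by all runs, is a deliberate choice here: it charges failed attempts against the successes, which tends to reflect what a completed goal actually costs.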

Tools & Frameworks Covered:

DeepEval and GEval for metric-based evaluation
RAGAS for assessing retrieval-based systems
Python for implementing automated test pipelines
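
To give a feel for what retrieval evaluation measures, the following Python sketch computes simple set-based retrieval precision and recall plus a crude token-overlap proxy for answer grounding. It deliberately uses neither RAGAS nor DeepEval, whose metrics are defined differently and typically rely on an LLM as judge. The function names, document IDs, and sample texts are assumptions made up for this example.

```python
def retrieval_precision_recall(retrieved_ids, relevant_ids):
    """Set-based precision and recall of retrieved document IDs."""
    retrieved = set(retrieved_ids)
    relevant = set(relevant_ids)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


def answer_token_overlap(answer, context_chunks):
    """Crude grounding proxy: share of answer tokens present in the context."""
    answer_tokens = answer.lower().split()
    context_tokens = set(" ".join(context_chunks).lower().split())
    if not answer_tokens:
        return 0.0
    return sum(t in context_tokens for t in answer_tokens) / len(answer_tokens)


# Hypothetical example data: which documents were retrieved vs. truly relevant.
precision, recall = retrieval_precision_recall(
    retrieved_ids=["doc1", "doc3", "doc7"],
    relevant_ids=["doc1", "doc2", "doc3"],
)
overlap = answer_token_overlap(
    "paris is the capital of france",
    ["Paris is the capital of France and its largest city"],
)
print(precision, recall, overlap)
```

Token overlap ignores paraphrase and word order, so it is only a cheap sanity check. Judge-based metrics exist precisely because a grounded answer can share few literal tokens with its source.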

By the End of This Course, You Will Be Able To:

Design a complete AI agent testing strategy from scratch
Implement functional and non-functional AI validation frameworks
Apply objective metrics for task, goal, and efficiency evaluation
Test RAG pipelines for retrieval and answer accuracy
Distinguish between deterministic, agentic, and autonomous systems
Build a portfolio project that demonstrates your AI testing expertise

Who this course is for:

Software Testers & QA Engineers
AI / ML Engineers
Data Scientists & NLP Practitioners
AI Product Managers & Tech Leads
Quality Enthusiasts Curious About AI Testing

Course content

Introduction: 4 lectures • 15min
Environment Setup: 5 lectures • 12min
Introduction to AI Agents: 7 lectures • 39min
Well-Known Agent Protocols: 2 lectures • 11min
How to Test AI Agents - Core Functional Correctness: 12 lectures • 1hr 15min
Operational and Non-Functional Testing: 4 lectures • 24min
RAGAS Agent Testing Framework: 9 lectures • 59min
