Udemy Business

Teach on Udemy

Turn what you know into an opportunity and reach millions around the world.

Learn More

Your cart is empty.

Keep shopping

Datadog LLM Observability: Monitor & Trace AI in Production

Name: Datadog LLM Observability: Monitor & Trace AI in Production
Rating: 4.5 (10 reviews)

Master enterprise AI monitoring with tracing, evaluations, cost control, and security compliance using Datadog Platform

Highest Rated

Created byPaulo Dichone | Software Engineer, AWS Cloud Practitioner & Instructor

Last updated 3/2026

English

What you'll learn

Instrument LLM applications with Datadog's ddtrace SDK for full visibility into prompts, completions, and token usage
Trace complex AI agent workflows including multi-turn conversations, tool calls, and decision paths with enterprise-grade debugging
Implement production evaluations using managed checks (toxicity, relevancy) and custom LLM-as-a-judge evaluators
Monitor and optimize LLM costs with automated cost tracking, budget alerts, and model comparison dashboards
Run experiments to test prompt and model changes before production deployment using Datadog's experimentation framework
Build secure AI systems with PII scrubbing, compliance patterns, and security monitoring for enterprise requirements
Instrument RAG pipelines with custom spans for embedding, retrieval, and generation steps for complete workflow visibility
Integrate LLM observability with existing Datadog APM, infrastructure, and security tools for unified enterprise monitoring

Course content

8 sections • 32 lectures • 4h 8m total length

What You'll Learn1:00
Explore the enterprise value of Datadog observability, set up LLM observability, and instrument LLM applications with hands-on tracing of AI workflows, including quality, cost, security, compliance, and production patterns.
Why LLM Observability Matters for Enterprises5:30
Understand why LLM observability is essential for enterprise production, addressing non-deterministic AI, cost, security, and performance with real-time traces and dashboards.
Datadog LLM Observability – Core Capabilities and Dashboard Demo5:15
Explore Datadog LLM observability’s four core capabilities—tracing, evaluations, experiments, and cost monitoring—for debugging, quality checks, AB testing, and budget-aware production insights.
Section 1 Knowledge Check: LLM Observability Fundamentals

Datadog Account Setup and Testing – Hands-on6:44
Sign up for the Datadog free trial, configure a Python LLM app, and load environment variables to run and trace a chat completion in production.
Span Types and SDK Integrations Overview4:45
Explore span types and SDK integrations to observe LLM traces, including LLM, workflow, agent, tool retrieval, and embedding spans, and review duration, input tokens, and costs with the Python SDK.
First Traced LLM Call in a Local Environment – Hands-on9:33
Quick Check in0:55
Section 2 Knowledge Check: Setting Up LLM Observability

Creating LLM Spans with Annotations and Tags – Hands-on11:43
Instrument direct llm api calls with custom spans and annotations to build multi-step pipelines and enterprise observability using llm obs decorators and metadata tags.
Instrumenting Multi-step Workflows – Hands-on15:46
Instrument multi-step LLM workflows in production by tracing embedding, retrieval, context assembly, and generation, enabling nested spans and detailed latency, metadata, and error visibility.
RAG Pipeline with Full Observability – Hands-on13:53
Demonstrates a hands-on rag pipeline with full observability using Datadog LLM Observability, including ChromaDB vector store, cosine similarity, embeddings, retrieval, and LLM-generated responses.
LangChain Integration – Part 18:44
LangChain RAG Pipeline Auto-Instrumentation – Hands-on6:55
Demonstrates LangChain RAG pipeline auto-instrumentation with zero-code tracing, using a vector store, embeddings, and a chat prompt template to query enterprise docs with Datadog LLM observability.
Section 3 Knowledge Check: Instrumenting LLM Applications

Tracing Agentic Workflows – Hands-on15:19
Trace and monitor non-deterministic AI agents in production with Datadog, visualizing decision paths, tool calls, and multi-agent orchestration through a hands-on customer support agent example.
Multi-Agent Systems – Hands-on16:08
Explore the orchestrator plus workers pattern in enterprise ai, with a pipeline of specialized agents (research, analysis, generation, validation) that plan, execute, and synthesize results, plus end-to-end tracing.
Debugging Agent Issues – Overview5:10
Explore common agent debugging scenarios like infinite loops, wrong tool selection, and latency spikes, and use state updates, termination conditions, step limits, prompt refinement, and observability to trace issues.

LLM Experiments Overview and Dataset Creation – Hands-on9:25
Create and manage data sets, run LLM experiments with evaluators, and compare results in Datadog LLM Observability to make educated deployment decisions.
Generating a Golden Evaluation Set – Hands-on2:54
Create a golden evaluation set for production support data sets in LLM experiments, covering easy baseline, hard, adversarial, and off topic categories, with metadata labels to filter experiment results.
Running LLM Experiments in Datadog – Dashboards and Comparisons16:17
Learn to run end-to-end LLM experiments in Datadog by pairing a dataset, a task, and evaluators including contains key info, semantic similarity, and safety checks, then compare results in dashboards.
A/B Testing Prompts – Full Workflow – Hands-on9:30
Demonstrates A/B testing prompts to compare concise versus empathetic variants using a designed data set, evaluating empathy scores, actionability, and semantic similarity to pick the better prompt.
Setting Up Evaluations and Quality Monitoring – Hands-on22:40
Configure and manage automated quality evaluations for LLM outputs using Datadog observability, building datasets from production traces and running experiments before deployment. Track metrics such as toxicity, topic relevancy, failure to answer, and completions with zero configuration, dashboards, and alerts across OpenAI, Azure OpenAI, and Google Cloud Vertex AI.
Creating Custom Evaluations2:47
Create an evaluation in LM Observability by naming the eval, attaching an LM account, selecting GPT-4 mini, configuring a system prompt and input variables, and set pass criteria for monitoring.
Evaluations and Monitoring in Code Only – Hands-on15:00

Security, Compliance, and Production – Overview3:15
Adopt enterprise-ready patterns for production LLMs, including SOC 2, HIPAA, GDPR, PII reduction, and audit trails, with optional PII scrubbing via a sensitive data scanner before or after processing.
Setting Up PII Redaction Function and Testing - Hands-on5:30
Set up and test a PII scrubber using regex to redact emails and credit cards, then scrub inputs before LLM annotation and verify redacted data appears in Datadog traces.
Data Scanning Dashboard Overview4:21
Explore the sensitive data scanner in organization settings, configure code and storage scanning, connect to GitHub, GitLab, and Azure DevOps, and manage scanning rules and groups for llm observability.
Testing a Custom PII Redaction Group in Datadog – Hands-on7:17
Learn hands-on how to configure a Datadog llm observability data scanner, build a custom pii redaction group, and verify redactions for ssn, passport, and emails in production dashboards.
LLM Apps Security and Compliance and production Hands – Hands-on5:28
Design secure, compliant llm apps with a pii scrubber and security monitor to detect prompt-injection patterns and scrub data before responses, using Datadog dashboards.
Production Deployment Architecture and Checklist2:36
Explore a quick production deployment architecture for LLM observability in Datadog, where the application uses the DTrace SDK or agentless mode and enables APM correlation across traces, logs, and metrics.

Requirements

Basic Python programming experience (intermediate level)
Familiarity with LLM APIs (OpenAI, Anthropic, or similar)
Datadog account (free trial available) or ability to create one
Basic understanding of LLM concepts (prompts, tokens, completions)
Optional: Experience with LangChain or similar LLM frameworks

Description

Are your LLM applications running blind in production?

You've deployed an AI agent, a RAG pipeline, or an LLM-powered chatbot.

But can you answer these questions?

How much did that runaway agent loop cost before someone noticed?
Why did hallucination rates spike last Tuesday?
Which step in your RAG pipeline is returning irrelevant documents?
How do you prove to compliance that you're protecting customer PII in LLM conversations?

If you can't answer these questions with data, you have a production problem.

Traditional APM tools see your LLM as a black box. They measure latency and error rates, but they can't show you token flows, prompt effectiveness, or quality degradation.

LLMs are fundamentally different—non-deterministic, multi-step, token-priced, and quality-sensitive.

You need LLM-native observability.

Introducing Datadog LLM Observability

This course is the definitive guide to Datadog's LLM Observability platform for enterprise teams.

If you're already using Datadog for APM, infrastructure, or security, this integrates directly into your existing stack—no new tools to learn, no separate dashboards to monitor.

What you'll build:

Throughout this course, you'll instrument a production-grade Customer Support AI Agent with:

Multi-turn conversation tracing
Tool integration (order lookup, refund processing)
Custom quality evaluations
Cost monitoring dashboard
PII scrubbing compliance

This isn't a toy example—it's the architecture real enterprise teams deploy.

Who this course is for:

Enterprise Teams already using Datadog for APM/Infrastructure who want unified visibility into their AI workloads
Technical Leads & Architects evaluating or implementing LLM observability solutions within existing Datadog ecosystems
Platform Engineers building internal AI infrastructure who need to provide observability standards for development teams
ML Engineers & AI Engineers building LLM-powered applications who need production-grade monitoring and debugging capabilities

Datadog LLM Observability: Monitor & Trace AI in Production

What you'll learn

Explore related topics

Course content

Introduction & Enterprise Value3 lectures • 12min

Setting Up LLM Observability4 lectures • 22min

Instrumenting LLM Applications5 lectures • 57min

Tracing Agentic AI Workflows3 lectures • 37min

Evaluations & Quality Monitoring7 lectures • 1hr 19min

Cost Monitoring & Optimization2 lectures • 12min

Security, Compliance & Production Patterns6 lectures • 28min

Bonus2 lectures • 2min

Requirements

Description

Who this course is for: