Datadog LLM Observability: Monitor & Trace AI in Production
Highest Rated
Rating: 4.5 out of 5 (10 ratings)
241 students

Master enterprise AI monitoring with tracing, evaluations, cost control, and security compliance using the Datadog platform
Last updated 3/2026
English

What you'll learn

  • Instrument LLM applications with Datadog's ddtrace SDK for full visibility into prompts, completions, and token usage
  • Trace complex AI agent workflows including multi-turn conversations, tool calls, and decision paths with enterprise-grade debugging
  • Implement production evaluations using managed checks (toxicity, relevancy) and custom LLM-as-a-judge evaluators
  • Monitor and optimize LLM costs with automated cost tracking, budget alerts, and model comparison dashboards
  • Run experiments to test prompt and model changes before production deployment using Datadog's experimentation framework
  • Build secure AI systems with PII scrubbing, compliance patterns, and security monitoring for enterprise requirements
  • Instrument RAG pipelines with custom spans for embedding, retrieval, and generation steps for complete workflow visibility
  • Integrate LLM observability with existing Datadog APM, infrastructure, and security tools for unified enterprise monitoring
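
As a rough illustration of the custom-span idea for RAG pipelines mentioned above, here is a minimal sketch that uses only the Python standard library. It is not the ddtrace API: the `span` helper, the `SPANS` list, and the `answer` function are hypothetical stand-ins, and the embedding, retrieval, and generation steps are placeholders for real calls.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory span store; a real tracer would export these records.
SPANS = []

@contextmanager
def span(name, **tags):
    """Record the name, tags, and wall-clock duration of one pipeline step."""
    record = {"name": name, "tags": tags}
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["duration_s"] = time.perf_counter() - start
        SPANS.append(record)

def answer(question):
    """Run a toy RAG workflow with one span per step (all steps are stand-ins)."""
    with span("rag_workflow", question=question):
        with span("embedding") as s:
            # Stand-in embedding: one number per word.
            vector = [float(len(word)) for word in question.split()]
            s["tags"]["dimensions"] = len(vector)
        with span("retrieval") as s:
            # Stand-in document store returning a fixed document.
            docs = ["Refunds are processed within 5 days."]
            s["tags"]["documents"] = len(docs)
        with span("generation") as s:
            # Stand-in for the LLM call.
            reply = f"Based on {len(docs)} document(s): {docs[0]}"
            s["tags"]["output_chars"] = len(reply)
    return reply
```

Calling `answer("How long do refunds take?")` leaves one record per step in `SPANS`, with the inner steps recorded before the enclosing workflow span. Seeing a duration and tags for each step separately is what makes it possible to tell a slow or irrelevant retrieval apart from a slow generation.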

Course content

8 sections, 32 lectures, 4h 8m total length
  • What You'll Learn (1:00)

    Explore the enterprise value of Datadog observability, set up LLM observability, and instrument LLM applications with hands-on tracing of AI workflows, including quality, cost, security, compliance, and production patterns.

  • Why LLM Observability Matters for Enterprises (5:30)

    Understand why LLM observability is essential for enterprise production, addressing non-deterministic AI, cost, security, and performance with real-time traces and dashboards.

  • Datadog LLM Observability: Core Capabilities and Dashboard Demo (5:15)

    Explore the four core capabilities of Datadog LLM Observability (tracing, evaluations, experiments, and cost monitoring) for debugging, quality checks, A/B testing, and budget-aware production insights.

  • Section 1 Knowledge Check: LLM Observability Fundamentals

Requirements

  • Basic Python programming experience (intermediate level)
  • Familiarity with LLM APIs (OpenAI, Anthropic, or similar)
  • Datadog account (free trial available) or ability to create one
  • Basic understanding of LLM concepts (prompts, tokens, completions)
  • Optional: Experience with LangChain or similar LLM frameworks

Description

Are your LLM applications running blind in production?

You've deployed an AI agent, a RAG pipeline, or an LLM-powered chatbot. 

But can you answer these questions?

  • How much did that runaway agent loop cost before someone noticed?

  • Why did hallucination rates spike last Tuesday?

  • Which step in your RAG pipeline is returning irrelevant documents?

  • How do you prove to compliance that you're protecting customer PII in LLM conversations?


  If you can't answer these questions with data, you have a production problem.


Traditional APM tools see your LLM as a black box. They measure latency and error rates, but they can't show you token flows, prompt effectiveness, or quality degradation.

LLMs are fundamentally different—non-deterministic, multi-step, token-priced, and quality-sensitive.


  You need LLM-native observability.


Introducing Datadog LLM Observability

This course is the definitive guide to Datadog's LLM Observability platform for enterprise teams. 

If you're already using Datadog for APM, infrastructure, or security, this integrates directly into your existing stack—no new tools to learn, no separate dashboards to monitor.


  What you'll build:

  Throughout this course, you'll instrument a production-grade Customer Support AI Agent with:

  • Multi-turn conversation tracing

  • Tool integration (order lookup, refund processing)

  • Custom quality evaluations

  • Cost monitoring dashboard

  • PII scrubbing for compliance


  This isn't a toy example—it's the architecture real enterprise teams deploy.

Who this course is for:

  • Enterprise Teams already using Datadog for APM/Infrastructure who want unified visibility into their AI workloads
  • Technical Leads & Architects evaluating or implementing LLM observability solutions within existing Datadog ecosystems
  • Platform Engineers building internal AI infrastructure who need to provide observability standards for development teams
  • ML Engineers & AI Engineers building LLM-powered applications who need production-grade monitoring and debugging capabilities