
Walk through the entire platform architecture end to end. You'll see how users hit the API layer, how agents run as FastAPI servers on EKS, how gateway services decouple agents from infrastructure, and how observability, security, and deployment tie it all together.
Get your local development environment running. We install dependencies, configure Docker, set up the project with `docker compose up`, and verify everything works so you can follow along with the rest of the course. Includes live terminal demo.
Tour the repository structure , the agent services, gateway services, shared core package. Understand how the folders map to the architecture and where to find every component discussed in the course.
Make your first API call to Amazon Bedrock using the Converse API. Learn how to send messages, handle responses, configure model parameters, and understand the request/response lifecycle that every agent in the platform uses.
Improve LLM output quality with two essential prompting techniques. Chain of Thought forces step-by-step reasoning for complex problems. Few-Shot provides example input-output pairs to guide the model's behavior. See before-and-after comparisons that show why these matter.
Understand Retrieval-Augmented Generation from the ground up. Learn how text becomes numerical embeddings, how vector similarity search finds relevant documents, and how retrieved context gets injected into prompts. This is the foundation for the Retrieval Gateway you'll build later.
Give LLMs the ability to take actions. Learn how function calling works , the model generates structured JSON matching a tool schema, your code executes the function, and the result feeds back into the conversation. This is the mechanism behind every agent tool in the platform.
Build multi-step workflows where the output of one LLM call becomes the input to the next. Learn when sequential processing is the right choice, how to pass context between steps, and how to handle failures mid-chain. Includes live terminal demo.
Route incoming requests to the right handler based on content. Build a classifier that analyzes input and dispatches to specialized prompts or agents. This is the pattern behind customer support triage, intent detection, and multi-skill agents. Includes live terminal demo.
Execute multiple independent LLM calls simultaneously to reduce latency. Learn when tasks can safely run in parallel, how to fan out and aggregate results, and the trade-offs between parallel and sequential processing. Includes live terminal demo.
Break complex tasks into subtasks using an orchestrator that delegates to specialized workers and aggregates their results. This pattern powers research agents, code generation pipelines, and any task too complex for a single LLM call. Includes live terminal demo.
Build agents that improve their own output. An evaluator scores the result, and if it doesn't meet the quality threshold, an optimizer refines it in a feedback loop. Learn how to set quality criteria, avoid infinite loops, and know when the output is good enough. Includes live terminal demo.
Explore the shared core package that every agent uses. Understand the unified request/response models (Pydantic), the middleware stack (auth, logging, error handling), and the gateway clients that connect agents to LLM, Memory, and Retrieval services.
Build your first production agent using the Strands framework. Walk through the server-controller-agent architecture: FastAPI handles HTTP, the controller manages business logic, and Strands powers the AI. See how the platform abstractions keep your agent code clean.
Build an agent that retrieves relevant documents before responding. Connect to Amazon Bedrock Knowledge Base through the Retrieval Gateway, inject retrieved context into the agent's prompt, and see how RAG transforms a basic chatbot into a knowledge-aware assistant.
Build a graph-based agent using LangGraph. Define nodes for each processing step, connect them with conditional edges, and manage state as the conversation flows through the graph. See how the same platform abstractions work with a completely different framework.
Understand why the platform uses converters to translate between framework-specific formats and a unified platform format. See how you can swap Strands for LangGraph (or any future framework) without changing your middleware, gateway clients, or deployment infrastructure.
Understand MCP , the open standard for connecting AI models to external tools and data sources. Learn the client-server architecture, how tools are discovered and invoked, and why MCP eliminates custom integration code for every new tool.
Build a working MCP server that exposes Bedrock Knowledge Base as a standardized tool. Walk through tool registration, request handling, and response formatting. Any MCP-compatible agent can now use your knowledge base without custom code.
Orchestrate multiple agents working together. Learn delegation patterns where one agent hands off subtasks to specialists, and graph-based orchestration where conditional edges determine which agent runs next based on conversation state.
See MCP in a real-world scenario. The Jira Agent connects to Jira through an MCP server, enabling it to create issues, query boards, and update tickets. Understand how the same pattern applies to any external service , Slack, GitHub, databases, or internal APIs.
Explore the LLM Gateway powered by LiteLLM. Understand how it provides a unified interface to multiple models (Claude, Titan, etc.), enforces per-user rate limits, tracks costs, and handles failover , all without agents knowing which model they're calling.
Explore the Memory Gateway that gives agents persistent memory. Understand how conversation history is stored in Aurora PostgreSQL, how sessions are managed, and why centralizing memory as a service means any agent can recall past conversations without duplicating storage logic.
Explore the Retrieval Gateway that connects agents to knowledge. Understand how it wraps Bedrock Knowledge Base, performs vector similarity search, and returns relevant documents. Learn why retrieval is a gateway service and how it enables RAG across every agent on the platform.
Build the AWS foundation with Terraform. Walk through the modular infrastructure — VPC, subnets, EKS cluster, Aurora PostgreSQL, Bedrock access, and Cognito. Understand why Terraform modules keep infrastructure composable, testable, and reusable. Includes live terminal demo.
Deploy services to Kubernetes using Helm. Walk through the Helm chart structure for each service — deployments, services, configmaps, and ingress rules. Understand how the ALB Ingress Controller routes traffic and how Helm values make deployments configurable. Includes live terminal demo.
Implement production security across the platform. Set up Cognito for M2M authentication with OAuth2 Client Credentials flow, IRSA for pod-level IAM roles (so agents never hold AWS credentials), and External Secrets Operator for syncing secrets from AWS Secrets Manager into Kubernetes.
Instrument the platform with OpenTelemetry. Set up distributed tracing across agents and gateways, collect metrics, and correlate logs. Understand how traces flow through CloudFront → ALB → Agent → Gateway and how X-Ray and CloudWatch give you visibility into every request.
Watch the complete platform in action. We verify running services on EKS, obtain a Cognito access token, send messages through CloudFront, prove memory persistence (Claude remembers your name), query the Memory Gateway directly, and trace the full request flow through every service. Includes full live demo.
Now that you've built and understood the platform, learn how to extend it. We cover adding new agents, integrating additional MCP servers, scaling for production traffic, and resources for deepening your AWS and AI engineering skills.
AI agents are everywhere. Production AI systems are not.
Most courses stop at prompts and demos. This course teaches you how to design, build, and deploy a production-grade agentic AI platform on AWS, the same way engineering teams build real systems at scale.
Across 31 hands-on lessons, you will build a complete multi-service platform from the ground up using Python, AWS Bedrock (Claude and Titan models), Terraform, Kubernetes (EKS), FastAPI, Docker, and Helm. This is not a toy project. It is a full production system with authentication, memory, retrieval, orchestration, observability, and secure service-to-service communication, all deployed on real AWS infrastructure.
What you will build and learn:
Agentic AI patterns: chaining, routing, parallelization, orchestrator-worker, and evaluator-optimizer workflows using LangGraph and Strands Agents
Retrieval-Augmented Generation (RAG) with Bedrock Knowledge Bases, OpenSearch vector search, and a dedicated Retrieval Gateway
Multi-agent systems with delegation, tool use, function calling, and Model Context Protocol (MCP)
LLM Gateway architecture: model routing, abstraction, streaming, and cost control across Large Language Models
Memory and state management with PostgreSQL (Aurora), Redis (ElastiCache), and persistent agent memory
Observability and monitoring using OpenTelemetry, AWS X-Ray, and CloudWatch for full trace visibility across agents
Infrastructure as Code: provision and deploy everything with Terraform and Kubernetes (EKS) using production Helm charts
Prompt engineering fundamentals: chain-of-thought, few-shot examples, and structured evaluation techniques
What makes this course different:
You will not just copy code. Every architectural decision is explained, why each service exists, what trade-offs were made, and how the components fit together. You will understand how to move from a single notebook prototype to a scalable, secure, enterprise-ready AI platform.
Who this course is for:
Software engineers, backend developers, and DevOps/platform engineers who want to build production LLM-powered applications
ML engineers and data scientists moving from experimentation to production agentic AI systems
Technical leads and architects evaluating how to structure AI platforms for their organizations
Prerequisites:
Intermediate Python proficiency
Basic familiarity with AWS (an AWS account is needed for the labs)
Comfort with the command line and containers (Docker basics)
Basic familiarity with Kubernetes and Terraform is helpful for the deployment sections, but not strictly required
No prior AI/ML experience required. We cover the fundamentals before going deep
By the end of this course, you will have the skills and confidence to architect, deploy, and operate production agentic AI systems in real enterprise environments. Stop building demos. Start building production AI platforms.