Advanced RAG: Build & Deploy Production GenAI Apps

Name: Advanced RAG: Build & Deploy Production GenAI Apps
Rating: 4.6 (48 reviews)

Multi-Agent RAG, CrewAI, AutoGen, Microsoft Agent Framework, RAG, Langchain, Deep RAG, Production RAG, RAGWire

Bestseller

Highest Rated

Created byKGP Talkie | Laxmi Kant

Last updated 4/2026

English

What you'll learn

Build a production RAG pipeline with BM25 hybrid search, RRF fusion, and Qdrant vector database
Build agentic RAG systems with LangChain, LangGraph self-correcting agents, and supervisor workflows
Build multi-agent RAG with CrewAI, Microsoft AutoGen, and Microsoft Agent Framework
Deploy RAG agents to AWS ECS Fargate, GCP Cloud Run, Azure, Railway, and Render with Docker
Build a FastAPI backend with OpenAI-compatible endpoints, SSE streaming, and Postman testing
Build a production Chainlit chat UI with authentication, chat history, and document ingestion
Configure RAGWire with OpenAI GPT, Groq, Google Gemini, Ollama, and HuggingFace embeddings
Implement LLM-driven auto metadata filtering over complex nested document structures in Qdrant

Course content

11 sections • 114 lectures • 11h 0m total length

Introduction2:36
Course Introduction!
What You Will Learn in This Course!5:43
Build a chainlit front-end connected to a back-end agentic system and vector database, enabling data uploads, OpenAI-compatible endpoints, and production-grade RAG app deployment.
Download Code Files0:01
Getting Started with The Course5:16
Environment Setup - PIP, UV, Anaconda and Requirements.txt6:00
LangSmith Setup: Debug LangChain RAG Pipelines5:35
Install Docker and Qdrant Vector DB Locally5:30
Ollama Setup: Run Qwen 3.5 and Gemma 4 Locally6:46

RAGWire Preview: Hybrid Search RAG in Production1:00
Discover Ragwire, a production-grade REG framework that connects vector databases to document converters and supports multiple LLMs like Ullama, OpenAI, Anthropic, and Grok. Compare Ragwire with basic REG and explore its architecture for easier deployment.
RAGWire: Open-Source Production RAG Toolkit Explained4:44
RAGWire presents a three-part production rag pipeline, ingest into a vector store, query relevant chunks, and generate answers with an llm, plus smart markdown chunking and dense plus sparse embeddings.
RAGWire Query Pipeline: Dense + Sparse Search with RRF4:49
RAGWire End-to-End: Ingestion, Retrieval and Reranking5:10

RAGWire Setup Overview: What You Will Build1:49
RAGWire Installation: Python Environment Setup3:53
config.yaml Part 1: Embedding Model Configuration6:49
config.yaml Part 2: Qdrant and Retrieval Settings6:53
config.yaml Part 3: LLM and Generation Settings4:30
Run RAGWire Locally with Ollama and Qwen 3.53:04
RAGWire Jupyter Notebook: Interactive Dev Environment3:53
Import env vars and enable langsmith tracing, then set up rag wire, check version, and configure info-level logging. Load config.yaml with ulama embedding, ulama llm, vector store, and hybrid retriever.
Connect RAGWire to Qdrant Vector Database via Config5:48
Connect rag wire to Qdrant vector database by loading config.yaml, initializing document loader, text splitter, and embedding model, then create and verify a hybrid search enabled collection on localhost 6333.
Document Ingestion: Chunking and Indexing into Qdrant7:57
Advanced Ingestion: SHA-256 Dedup and Metadata5:10
First Hybrid Search: BM25 + Dense Retrieval with Qdrant5:40

Advanced RAG Overview: Metadata Filtering and Agents1:35
Batch Document Ingestion from a Directory with RAGWire6:52
Custom Metadata Schema for Richer Document Extraction8:31
Design a pedantic yaml metadata schema to extract company name, document type, and fiscal year. Configure ragwire prompts for structured output and verify metadata ingestion from documents like a 10-K.
Explore RAGWire APIs and Metadata for Hybrid Search4:32
Explore RAGWire APIs and metadata for hybrid search by discovering metadata fields, filtering options, and structured extraction to build production-grade GenAI apps with agents.
Hybrid Search with Manual Metadata Filtering in Qdrant7:10
Hybrid Search over Complex Nested Data Structures5:09
LLM-Driven Auto Metadata Filtering in Hybrid SearchLLM-Driven Auto Metadata Filt5:37
Agentic RAG Explained: Tools, Memory and Reasoning6:03
Build a Simple Agentic RAG with LangChain and RAGWire10:10
Filter Context Extraction for Better RAG Retrieval5:27
Filter-Aware Agentic RAG: LLM-Guided Hybrid Retrieval5:07

RAGWire Multi-Provider Overview: OpenAI, Groq, Gemini2:18
RAGWire: Multi-Provider LLM and Embedding Setup8:26
OpenAI: RAGWire Setup with GPT and OpenAI Embeddings7:07
Set up OpenAI for RAGWire by creating an API key, storing it in environment variables, and configuring YAML with text embedding three small and LLM models (GPT 5.4 or nano).
OpenAI: Hybrid Search and Batch Ingestion with GPT9:04
Groq: RAGWire Setup for Fast LLM Inference4:49
Groq: Hybrid Search with HuggingFace Embeddings8:46
Gemini: RAGWire Setup with Google Gemini Embeddings7:27
Copy and rename the grok config to Gemini, set the Google API key, and update embedding and llm to Gemini 001 and Gemini models for Ragwire.
Gemini: Hybrid Search and Document Ingestion7:35
Qdrant Cloud: Free Vector DB for RAG Ingestion7:25
Qdrant Cloud: Agentic RAG with Google Gemini10:40

Chainlit RAG Chat UI: Section Overview1:05
Chainlit Intro: Build a Production-Ready RAG Chat UI6:53
Add LangChain Agent Tools to a Chainlit RAG App6:26
Chainlit on_chat_start: RAG Agent Initialization4:26
Integrate Agentic RAG with Chainlit Chat Interface5:16
Chat with RAG Agent via Production Chainlit UI7:55
Run and test the production rag app by launching app.py in the conversational rag chatbot directory with chainlet, and observe memory, vector db access, and UI at localhost 8000.
Document Ingestion via Chainlit Chat UI Part 15:44
Document Ingestion via Chainlit Chat UI Part 24:53
Upload Documents and Chat with RAGWire via Chainlit5:56

FastAPI RAG Backend: Section Overview1:59
Multi-Agent RAG: FastAPI Backend and Chainlit Frontend4:33
OpenAI-Compatible FastAPI Endpoints Explained Part 14:40
OpenAI-Compatible FastAPI Endpoints Explained Part 24:44
LangChain Agent Setup for FastAPI RAG Endpoints5:50
SSE Streaming: LangChain Agent as OpenAI Endpoint6:35
Build OpenAI-Compatible RAG Endpoints with FastAPI10:43
Test FastAPI RAG Agent Endpoints with Postman10:05
Wire OpenAI compatible routes into FastAPI and initialize the Ragwire server. Test health, models, and chat completions with Postman and observe the API responses.
Chainlit Auth and Chat History: RAG App Setup Part 16:05
Chainlit Auth and Chat History: RAG App Setup Part 211:20
End-to-End Testing: RAG Agent and Chainlit Chat App2:58
Chainlit Chat App Response Correction3:16

Multi-Agent RAG Overview: LangGraph, CrewAI, AutoGen1:12
LangGraph Self-Correcting RAG: How It Works5:30
LangGraph RAG Nodes: Write, Retrieve and Generate7:04
LangGraph Self-Correcting RAG: End-to-End Testing5:46
LangGraph self-correcting RAG: explore end-to-end testing by implementing rewrite query, retriever, generate, and conditional ages nodes to route and refine results.
LangGraph Self-Correcting RAG: End-to-End Testing10:16
LangGraph Supervisor Multi-Agent: How It Works7:18
LangGraph: Build a Supervisor Multi-Agent Workflow7:12
LangGraph Supervisor Workflow: End-to-End Testing6:50
search_documents Tool Correction [IMP]0:08
CrewAI Document Assistant RAG Agent Explained5:52
CrewAI: Build a Document Assistant RAG Agent8:52
CrewAI: Build a Multi-Agent Document Analyst6:16
CrewAI Multi-Agent Document Analyst: E2E Testing6:08
Microsoft AutoGen Multi-Agent System Explained5:32
Microsoft AutoGen: Gemini Model Client Setup4:26
Microsoft AutoGen: Build a Research Collaboration Team7:44
Microsoft Agent Framework: Build Your First RAG Agent8:14
Microsoft Agent Framework: How Multi-Agent Workflow Works5:40
Microsoft Agent Framework: Task Specialist Agents6:58
Microsoft Agent Framework: Aggregate Specialist Responses7:06
Microsoft Agent Framework: End-to-End RAG Testing7:28
Learn to design an end-to-end RAG workflow with a synthesizer agent, multiple specialist agents, and an aggregator, then test streaming outputs and deploy the Microsoft multi-agent framework.

Production Deployment Overview: Docker and Render2:36
GitHub Repo Setup for Production RAG Agent Deployment6:00
Deploy your fastapi rag backend to production by creating a separate repository, forking the ragware fastapi rag backend, and preparing minimal requirements and environment variable for renderer, railway, and aws.
Dockerize Your RAG App: Create a Docker Container6:12
Build and Inspect a Docker Image Locally for RAG5:41
Docker .dockerignore: Prevent Credential Leaks in Prod4:24
Deploy RAG Agent to Render: Cloud Deployment Guide5:25
Test Chat Completion API on Deployed RAG Agent4:21
Chat with Your Live RAG App on Render7:52
Deploy and connect a live rag app on render to a local chainlet chat UI by configuring the API URL and running the frontend and backend together.
Secure RAG API with API Key: Production Best Practices6:49
Access Secured RAG API Endpoints with API Key7:21
Chainlit: Secure API Key Access for RAG Endpoints6:41

Requirements

Basic Python programming knowledge (functions, classes, pip)
Familiarity with REST APIs and using a terminal or command line
Basic understanding of Gen AI and Langchain concepts

Description

Retrieval-Augmented Generation (RAG) is at the core of every serious AI application today. But basic RAG pipelines quickly hit their limits when documents are large, queries are complex, or your application needs to run reliably in production.

In this course, you will build RAGWire — a production-grade RAG toolkit built on LangChain, Qdrant, and LangGraph — from the ground up. You will start with a simple hybrid search pipeline and progressively add advanced retrieval, metadata filtering, agentic RAG, multi-agent frameworks, a full chat UI, and multi-cloud deployment.

By the end of this course you will know how to:

Build a hybrid RAG pipeline with BM25 sparse + dense retrieval and Reciprocal Rank Fusion (RRF)
Configure RAGWire with OpenAI GPT, Groq, Google Gemini, Ollama, and HuggingFace embeddings
Implement LLM-driven auto metadata filtering over complex, nested document structures
Build agentic RAG pipelines with LangChain agent tools, memory, and reasoning
Build a self-correcting RAG agent that grades its own retrieval and rewrites queries when quality is low
Build supervisor multi-agent systems that route queries to specialist agents using LangGraph
Build multi-agent document analysts with CrewAI, Microsoft AutoGen, and Microsoft Agent Framework
Build a production Chainlit chat UI with authentication, chat history, and document upload
Build a FastAPI backend with OpenAI-compatible /v1/chat/completions endpoints and SSE streaming
Deploy RAG agents to Render, Railway, AWS ECS Fargate, GCP Cloud Run, and Azure
Secure production APIs with API keys and protect credentials with Docker .dockerignore

This is a hands-on, code-first course. Every section produces working, runnable code that you can adapt to your own documents and use cases.

Who this course is for:

Python developers who want to build production-grade RAG systems beyond basic tutorials
ML engineers looking to deploy LangChain and LangGraph agents to AWS, GCP, or Azure
Developers who want hands-on experience with LangGraph, AutoGen, and CrewAI
Backend developers who want to build OpenAI-compatible FastAPI endpoints for AI applications
AI engineers who want hands-on experience with CrewAI, AutoGen, and multi-agent frameworks
Anyone building document search, enterprise AI assistants, or agentic RAG applications

Advanced RAG: Build & Deploy Production GenAI Apps

What you'll learn

Explore related topics

Course content

Introduction8 lectures • 37min

Introduction to RAGWire RAG Framework4 lectures • 16min

RAGWire RAG Setup and First Retrieval11 lectures • 55min

Advanced Retrieval, Metadata Filtering and Agentic RAG11 lectures • 1hr 6min

RAGWire RAG with OpenAI, Groq, Gemini and Cloud Qdrant10 lectures • 1hr 14min

Real-World RAG: Gym Supplements Use Case with Agentic RAG4 lectures • 19min

Production RAG Chat UI with Chainlit and RAGWire9 lectures • 49min

FastAPI RAG Backend with OpenAI-Compatible Endpoints12 lectures • 1hr 13min

Multi-Agent RAG: LangGraph, CrewAI, AutoGen and Microsoft21 lectures • 2hr 12min

Deploy RAG Agent (AI App) to Production with Docker and Render11 lectures • 1hr 3min

Requirements

Description

Who this course is for: