The Complete Agentic AI Engineering Masterclass (2026)

Rating: 4.7 (50 reviews)

Build Autonomous AI Agents using ADK, LLM, RAG, Tools, MCP, Memory, Orchestration with real world capstone projects

Created by Jeiwin Bhamangol

Last updated 5/2026

English

What you'll learn

Run LLMs locally (e.g. Ollama, LM Studio, Hugging Face) to build and develop AI applications entirely on your own machine.
Create RAG systems integrating embeddings, vector stores, and local LLMs for efficient knowledge retrieval.
Build agentic systems where smart agents use tools and workflows to autonomously accomplish tasks.
Implement prompt engineering, context management, and guardrails to control agent behavior and ensure reliability.

Course content

15 sections • 35 lectures • 3h 54m total length

What is AI Model? • 8:22
This video provides an academic overview of Artificial Intelligence (AI) models, explaining what they are, how they are trained, and the diverse architectures that power them. An AI model is essentially a mathematical representation of patterns learned from data. Training involves exposing the model to large datasets, adjusting internal parameters (weights and biases) through optimization algorithms such as gradient descent, and minimizing error functions to achieve accurate predictions or meaningful outputs.
We then explore the major architectures that form the foundation of modern AI:
Convolutional Neural Networks (CNNs): Specialized for spatial data such as images, using convolutional layers to detect hierarchical features like edges, textures, and objects.
Autoencoders: Neural networks designed for unsupervised learning tasks, compressing and reconstructing data, widely applied in dimensionality reduction, anomaly detection, and generative tasks.
BERT (Bidirectional Encoder Representations from Transformers): A landmark language model that introduced deep bidirectional contextual understanding in NLP by leveraging the transformer architecture.
Transformers: General-purpose sequence models that rely on self-attention mechanisms, forming the backbone of state-of-the-art natural language, vision, and multimodal AI systems.
Diffusion Models: Probabilistic generative models that learn to reverse a gradual noising process step by step, enabling high-fidelity image, audio, and video generation.
By the end of this video, viewers will gain a structured understanding of how AI models are conceptualized, trained, and architected, along with insights into the evolution from traditional neural networks to advanced generative and transformer-based systems.
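As a minimal sketch of the training loop described above, the following fits a one-weight, one-bias model with batch gradient descent on mean squared error. The toy data, the learning rate, and the names train, xs, and ys are assumptions made for illustration and are not taken from the course material.

```python
# Toy data generated from y = 2x + 1 (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def train(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # internal parameters: one weight and one bias
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors under the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = train(xs, ys)
print(round(w, 3), round(b, 3))  # close to the true values 2.0 and 1.0
```

Real models differ mainly in scale: millions or billions of parameters, gradients computed by automatic differentiation, and mini-batches instead of the full dataset.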
What is Generative AI? • 5:56
This video provides a comprehensive overview of Generative AI (GenAI), focusing on how it works, the principles behind content generation, and the popular models driving today’s AI applications. Generative AI refers to models capable of creating new text, images, audio, or code by learning the underlying patterns and structures from vast datasets. During training, these models capture statistical relationships within data and then use probabilistic techniques to generate outputs that are coherent, contextually relevant, and often indistinguishable from human-created content.
We explore the mechanism of generation, where a model takes an input prompt and, based on its learned representations, produces novel content by predicting the most probable next element—whether that is the next word in a sentence, the next pixel in an image, or the next sound in audio synthesis.
The video also introduces popular Large Language Models (LLMs) and their role in modern AI systems:
OpenAI’s GPT family (e.g., GPT-4): Known for their versatility in conversation, reasoning, and creative text generation.
Google’s PaLM and Gemini models: Optimized for multilingual tasks, reasoning, and integration into Google’s ecosystem.
Anthropic’s Claude: Focused on safe, steerable AI interactions with emphasis on ethical alignment.
Meta’s LLaMA models: Open-weight LLMs enabling research and practical applications across academia and enterprises.
Mistral and other emerging models: Lightweight, efficient architectures optimized for performance in enterprise use cases.
We then discuss practical applications of these models, from chat-based assistants and content generation tools to code completion, scientific research, and enterprise automation.
By the end of this session, viewers will understand what makes Generative AI distinct, how it creates content based on trained patterns, and the landscape of leading LLMs shaping today’s AI-powered applications.
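The generation mechanism described above (predict the next element, append it, repeat) can be sketched with a tiny bigram model. This is a deliberately simplified stand-in for an LLM: the corpus, the function names, and the greedy "pick the most frequent follower" rule are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# A tiny made-up training corpus.
corpus = "the cat sat on the mat . the cat ate the fish ."


def build_bigram_counts(text):
    """Count, for every token, which tokens follow it in the training text."""
    counts = defaultdict(Counter)
    tokens = text.split()
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, prompt, max_new_tokens=5):
    """Greedy generation: repeatedly append the most frequent follower."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # the last token was never seen with a follower
        next_token = followers.most_common(1)[0][0]
        tokens.append(next_token)
    return " ".join(tokens)


counts = build_bigram_counts(corpus)
print(generate(counts, "the", max_new_tokens=1))  # the cat
```

An LLM replaces the count table with a neural network conditioned on the whole preceding context, but the outer loop has the same shape.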
Knowledge Check

Traditional Engineering vs Generative AI Engineering • 4:39
This lecture examines the contrast between Traditional Engineering and Generative AI Engineering, focusing on how their underlying principles, tools, and outcomes differ. Traditional engineering is rule-based and deterministic, relying on fixed logic, databases, frameworks, and testing to build predictable applications such as calculators, blogs, and media platforms. In contrast, generative AI engineering is data-driven and probabilistic, leveraging prompts, embeddings, vector databases, and large language models (LLMs) to produce context-dependent, conditional outputs.
We will analyze how these differences manifest across key aspects:
Approach: Deterministic rules vs. probabilistic, data-driven reasoning
Patterns: CRUD/API/DevOps vs. data, models, embeddings
Rules: Hard-coded logic vs. prompt engineering and parameter tuning
Outputs: Predictable/fixed vs. probabilistic/conditional
Tools: Frameworks, databases, Postman vs. LLMs, HuggingFace, vector DBs, evals
Domains: Classical apps and directories vs. NLP, AI assistants, classification, and prediction tasks
By the end of this lecture, learners will clearly understand how Gen-AI Engineering extends beyond traditional software practices, redefining workflows, toolchains, and solution spaces.

How LLM Process User Input • 11:59
This lecture unpacks the inner workings of Large Language Models (LLMs), breaking down the key components that enable them to understand and generate human-like text. We begin with tokenization, where input text is split into smaller units (tokens) for efficient processing. These tokens are then transformed into embeddings, high-dimensional numerical vectors that capture semantic meaning and relationships between words.
Next, we explore the self-attention mechanism, the core innovation behind transformers, which allows models to weigh the importance of different tokens in a sequence and capture contextual relationships across entire texts. Finally, we examine the prediction process, where the model generates output one token at a time: at each step it computes a probability distribution over the next token from patterns learned during training, then either picks the most probable token or samples from that distribution.
By the end of this lecture, learners will understand the step-by-step pipeline of LLMs—from raw text input to meaningful predictions—and how these components work together to power state-of-the-art AI applications.
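The pipeline above (tokens, then embeddings, then self-attention) can be sketched in plain Python. This is a simplified illustration under stated assumptions: whitespace tokenization, a hand-made three-word embedding table, and attention with queries, keys, and values all equal to the embeddings. A real transformer uses a learned tokenizer, learned embeddings, and learned projection matrices.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention with Q = K = V = embeddings."""
    d = len(embeddings[0])
    outputs, all_weights = [], []
    for query in embeddings:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
            for key in embeddings
        ]
        weights = softmax(scores)
        # New representation: weighted average of all token vectors.
        mixed = [
            sum(w * value[j] for w, value in zip(weights, embeddings))
            for j in range(d)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical embedding table; real embeddings are learned and much larger.
vocab = {
    "the": [1.0, 0.0, 0.0],
    "cat": [0.0, 1.0, 0.5],
    "sat": [0.0, 0.5, 1.0],
}
tokens = "the cat sat".split()             # tokenization (whitespace only)
embeddings = [vocab[t] for t in tokens]    # embedding lookup
outputs, weights = self_attention(embeddings)

for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how strongly one token attends to every token in the sequence. In this toy table "cat" attends more to "sat" than to "the" because their vectors are more similar.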

How RAG Works • 7:01
This lecture focuses on Retrieval-Augmented Generation (RAG), a powerful technique that combines information retrieval with generative AI to produce accurate, context-aware outputs. We begin by explaining the RAG workflow: a user query is transformed into embeddings, relevant documents are retrieved from a vector database, and the retrieved context is passed to a language model to ground its response in reliable knowledge.
We then explore the ecosystem of tools, much of it open source, that enables RAG, including:
Vector Databases (e.g., FAISS, Milvus, Pinecone, Weaviate) for efficient semantic search.
Embedding Models (e.g., OpenAI, HuggingFace, SentenceTransformers) to convert text into dense vector representations.
Frameworks and Orchestration Tools (e.g., LangChain, LlamaIndex, Haystack) for building RAG pipelines and integrating retrieval with LLMs.
By the end of this lecture, learners will understand how RAG enhances the reliability of LLMs, the key steps in building a RAG pipeline, and the diverse open-source frameworks available to implement it effectively.
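The RAG workflow described above (embed the query, retrieve similar documents, pass them to the model as context) can be sketched without any external services. Everything here is an illustrative assumption: the word-count "embedding" stands in for a real embedding model, the Python list stands in for a vector database, and the final prompt is only built, not sent to an LLM.

```python
import math
import re
from collections import Counter

# Stand-in knowledge base; a real system would store vectors in a vector DB.
documents = [
    "Ollama runs large language models on a local machine.",
    "FAISS is a library for similarity search over dense vectors.",
    "Docker packages applications into containers.",
]


def embed(text):
    """Toy embedding: a word-count vector instead of a learned dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_docs):
    """Ground the model by placing retrieved text ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


query = "Which library does similarity search over vectors?"
top = retrieve(query, documents, k=1)
print(build_prompt(query, top))
```

In a production pipeline the prompt would then be sent to a language model, and the retrieval step would typically be delegated to one of the vector databases listed above.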
When to use LLM Finetuning • 3:57
This lecture explores the decision-making process between Fine-tuning and Retrieval-Augmented Generation (RAG) for adapting Large Language Models to specific tasks. We begin by explaining fine-tuning, where a model’s weights are updated on domain-specific data to specialize it for tasks such as classification, sentiment analysis, or domain-restricted text generation. Fine-tuning is best suited for structured, repetitive, or highly specialized tasks where knowledge does not change frequently.
In contrast, RAG is introduced as a more flexible approach that augments a base model with external, retrievable knowledge. RAG is most effective when working with dynamic, evolving information or large knowledge bases, as it avoids retraining and enables models to stay current.
Through real-world examples, we illustrate:
Fine-tuning for tasks like legal document classification, medical report tagging, or customer intent detection.
RAG for applications such as enterprise search, question answering, or chatbots requiring access to large and constantly changing datasets.
By the end of this lecture, learners will clearly understand the trade-offs between fine-tuning and RAG, and how to choose the right strategy for different problem domains.

Opensource AI Ecosystem & Platforms • 2:40
This section explores the open-source ecosystem that empowers developers, researchers, and enterprises to build end-to-end AI solutions without being locked into proprietary systems. We examine the role of platforms that provide pretrained models, libraries for fine-tuning and inference, and toolkits for orchestrating workflows. Key focus areas include:
Model Hubs and Frameworks (e.g., Hugging Face, PyTorch, TensorFlow) that provide access to pretrained models and development libraries.
RAG and Orchestration Frameworks (e.g., LangChain, LlamaIndex, Haystack) that simplify building pipelines around LLMs.
Vector Databases and Embedding Tools (e.g., FAISS, Milvus, Weaviate, Chroma) that enable semantic search and retrieval.

Huggingface AI Models Repository • 4:36
The Hugging Face Model Hub is a central repository that hosts thousands of pretrained AI models across domains such as natural language processing, computer vision, speech, and multimodal tasks. It enables researchers and developers to share, discover, and reuse models, reducing the time and computational cost required to train models from scratch.
The repository supports a wide range of architectures, including BERT, GPT, T5, CLIP, Stable Diffusion, and domain-specific models fine-tuned for tasks like sentiment analysis, machine translation, summarization, or image classification. Each model entry typically includes documentation, usage examples, licensing details, and community-driven updates.
A key strength of the Hub is its interoperability with Hugging Face libraries, allowing developers to easily load and deploy models with just a few lines of code. Additionally, the repository fosters an open research community, where contributors publish checkpoints, benchmarks, and evaluations, driving rapid innovation and transparency in AI development.
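To give a sense of what "a few lines of code" can look like, here is a hedged sketch using the pipeline helper from the Hugging Face transformers library. It assumes the transformers package and a backend such as PyTorch are installed, and that model weights can be downloaded on first use. The wrapper name classify_sentiment is hypothetical, and the task's default model is used here, whereas a real project would normally pin a specific model ID.

```python
def classify_sentiment(texts):
    """Classify a list of strings with a pretrained model from the Hub."""
    # Imported inside the function so that defining it does not require
    # the transformers package to be installed.
    from transformers import pipeline

    # Loads the default sentiment model for this task (downloaded on first use).
    classifier = pipeline("sentiment-analysis")
    # Returns one dict per input with "label" and "score" keys.
    return classifier(texts)


# Example call (needs network access the first time it runs):
# for result in classify_sentiment(["I love this course."]):
#     print(result["label"], round(result["score"], 3))
```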
Huggingface Opensource Finetuning Datasets • 4:14
In this lecture, you will discover how the Hugging Face Datasets ecosystem supports different types of fine-tuning datasets and how its built-in Data Studio (Dataset Viewer + SQL console) helps you inspect, filter, and query dataset content—all before downloading or processing it.
You’ll learn:
Common dataset formats and examples used in fine-tuning (e.g. instruction/response, classification, image/text pairs)
How to explore dataset structure, metadata, and distributions via the Hugging Face Dataset Viewer
Using filtering, searching, and histograms to understand class balance, missing values, ranges, and data types
Running SQL queries directly inside the Dataset Viewer (via the SQL Console) to slice, transform, or extract subsets of data
How large datasets are auto-converted to Parquet for efficient querying in Data Studio, and the constraints (e.g. 5 GB view limits)
Best practices when selecting or preparing datasets for fine-tuning based on what you observe in Data Studio
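The kind of inspection query this lecture describes, such as checking class balance and missing values before fine-tuning, can be sketched locally. The example below is a stand-in only: it uses Python's built-in sqlite3 module and a made-up six-row table rather than the Hub's SQL Console, whose SQL dialect and table naming may differ.

```python
import sqlite3

# Made-up classification rows: (text, label). One row has missing text.
rows = [
    ("great product", "positive"),
    ("terrible support", "negative"),
    ("works as expected", "positive"),
    ("would not buy again", "negative"),
    ("love it", "positive"),
    (None, "positive"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE train (text TEXT, label TEXT)")
conn.executemany("INSERT INTO train VALUES (?, ?)", rows)

# Class balance: how many examples carry each label?
class_balance = conn.execute(
    "SELECT label, COUNT(*) AS n FROM train GROUP BY label ORDER BY n DESC"
).fetchall()

# Missing values: how many rows have no text at all?
missing_text = conn.execute(
    "SELECT COUNT(*) FROM train WHERE text IS NULL"
).fetchone()[0]

print(class_balance)  # [('positive', 4), ('negative', 2)]
print(missing_text)   # 1
```

A skewed label count or a high number of empty rows found this way is a signal to rebalance, filter, or choose a different dataset before fine-tuning.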
Huggingface Spaces (Image generation & Music Generation) • 6:44

Requirements

A desktop or laptop with internet access for hands‑on projects. Basic Python knowledge is a plus.

Description

This course takes you from the fundamentals of AI models to building and deploying intelligent AI agents using the latest generative AI frameworks and LLM-powered architectures. Designed for professionals, developers, and innovators, this program blends theory, practice, and hands-on insights.

Over the course, you’ll explore:

Foundations of Generative AI — Dive deep into CNNs, Transformers, Diffusion Models, VAEs, and how modern generative systems produce new content.
Traditional vs Agentic AI Engineering — Understand the shift from static models to reactive agents, and learn why agentic frameworks are the future.
How LLMs Work — Unpack tokenization, embeddings, self-attention, layers, prompts, and the reasoning pipelines behind GPT-style models.
RAG & Fine-Tuning — Learn when to fine-tune and when to use retrieval instead, build vector-based memory systems, and integrate retrieval-augmented generation (RAG) workflows.
Local LLM Deployment — Deploy open-source models like LLaMA, Mistral, and Alpaca on your own infrastructure for security, flexibility, and scale.
Hugging Face & Open-Source Ecosystem — Leverage the Hugging Face Model Hub, datasets, pipelines, and tools to accelerate development.
Agentic AI Projects (Hands-On) — Build independent agents, research assistants, Q&A systems, planning agents, and multi-agent pipelines.
Containerization & Cloud Deployment — Package agents with Docker, Kubernetes, or serverless architectures to deploy them reliably in production settings.
Scaling, Monitoring & Maintenance — Learn how to monitor agent performance, handle errors and fallback mechanisms, manage versioning, and scale gracefully.

By the end, you’ll be able to design, fine-tune, and deploy agentic AI systems confidently using Generative AI frameworks.

Who this course is for:

Beginners & Non‑Technical Learners: Eager to explore the world of Agentic AI, with no prior experience required.
Software Engineers & AI Developers: Seeking to build, deploy, and scale autonomous AI agents using frameworks like LangChain, LangGraph, and Ollama.
Data Scientists & Technical Professionals: Aiming to gain hands‑on experience with state‑of‑the‑art agentic frameworks and real‑world AI solutions.
Product Managers & Business Professionals: Looking to understand and lead AI projects, collaborate with AI teams, and drive business value using AI agent solutions.
Entrepreneurs & Small Business Owners: Interested in integrating AI agents into their products or automating tasks using no‑code platforms like LangFlow.

Course content

Introduction to AI Models • 2 lectures • 14min

Traditional Engineering vs Generative AI Engineering • 1 lecture • 5min

How LLM Works? • 1 lecture • 12min

Finetuning vs RAG (Retrieval Augmented Generation) • 2 lectures • 11min

Deploy Opensource LLM Locally • 2 lectures • 12min

Opensource AI Ecosystem • 1 lecture • 3min

Huggingface Ecosystem • 3 lectures • 16min

First AI Agent using Agent Development Kit (ADK) Hands On • 2 lectures • 10min

Multi Agent Architecture Patterns Hands On • 5 lectures • 34min

Agent Tools & Interoperability with Model Context Protocol (MCP) • 4 lectures • 39min
