
Introduction to the course + roadmap + Readaheads
In this lecture, we examine the recent inflection point in AI engineering — the shift from API-based consumption to local and open model ownership.
Over the past two years, open-weight frontier models, local inference tools, and rapidly expanding ecosystems like Hugging Face have fundamentally changed what engineers can control.
By the end of this lecture, you will be able to explain how this shift impacts deployment strategy, architectural responsibility, and the transition from experimentation to system-level engineering.
For most of the AI boom, developers interacted with models through remote APIs. Intelligence lived somewhere else — behind a hosted service, controlled by a provider.
In this section, we cross a major architectural threshold: running modern models locally.
Tools like Ollama make it possible to install and run powerful language models directly on your machine, shifting inference from remote infrastructure to developer-controlled environments.
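To make the shift concrete, here is a minimal sketch of talking to a locally running Ollama server over HTTP, using only the Python standard library. It assumes Ollama is listening on its default port (11434) and that the model name you pass has already been pulled. The helper names (build_generate_request, extract_text, generate) are illustrative and not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port (11434).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(raw_body: bytes) -> str:
    """Pull the generated text out of a non-streaming response body."""
    return json.loads(raw_body)["response"]


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the model's reply."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return extract_text(reply.read())
```

With the server running, a call such as generate("llama3.2", "Explain local inference in one sentence.") returns the reply as a string. The model name here is a placeholder; substitute any model installed on your machine. Nothing in this exchange leaves localhost, which is the point of the exercise.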
This seemingly simple step changes the engineering landscape: it enables experimentation, model choice, and system design that were previously practical only for large organizations.
In this lecture, we move from running a model locally to integrating it into a real development workflow.
You will:
Install OpenCode / Codex tooling in Visual Studio Code (Windows or Linux).
Connect your local Ollama instance to the assistant environment.
Configure the assistant to use local models instead of default cloud models.
Switch between models and observe how the interface interacts with the underlying inference engine.
This exercise highlights an important architectural concept: the interface and the model engine are separate systems. By connecting local inference to your development tools, you gain control over model selection, execution, and experimentation.
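The separation between interface and engine can be sketched in a few lines. This is an illustrative model of the idea, not how OpenCode or Codex is implemented: the Engine protocol, EchoEngine stand-in, and Assistant class are all hypothetical names. In a real setup, an engine's complete method would call the local inference server instead of echoing the prompt.

```python
from typing import Dict, Protocol


class Engine(Protocol):
    """Anything that turns a prompt into text can act as an engine."""

    def complete(self, prompt: str) -> str: ...


class EchoEngine:
    """Stand-in for a local model; a real engine would call an inference server."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class Assistant:
    """The interface layer: holds named engines and delegates to the active one."""

    def __init__(self, engines: Dict[str, Engine], active: str) -> None:
        self._engines = engines
        self.use(active)

    def use(self, name: str) -> None:
        """Switch the active engine without touching the interface code."""
        if name not in self._engines:
            raise KeyError(f"unknown engine: {name}")
        self._active = name

    def ask(self, prompt: str) -> str:
        """Forward the prompt to whichever engine is currently selected."""
        return self._engines[self._active].complete(prompt)
```

Switching models is then a single call to use, and ask stays unchanged. That mirrors what you observe in the exercise: the assistant's interface stays the same while the model behind it changes.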
In this lecture, we explore the other side of rapid AI tooling: when powerful systems become easy to deploy before their risks are fully understood.
You will install and run OpenClaw, an agentic framework designed to automate complex tasks using AI-driven workflows. Within minutes, you will see how quickly these systems can begin interacting with external services, executing commands, and coordinating multiple operations.
Through this exercise, we examine a growing pattern in modern AI tooling:
extremely powerful frameworks
minimal setup time
very little operational visibility
Using OpenClaw as a case study, we discuss how quickly automation can extend beyond its intended boundaries — and why many exposed deployments have already appeared on the public internet.
This lecture highlights the core tension of modern AI infrastructure:
Installation is easy. Judgment is not.
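One small, concrete piece of that judgment is knowing which network interface a service listens on. The sketch below is a generic check, not an OpenClaw feature: it assumes you can read each service's bind address from its configuration, and the service names in the usage note are made up for illustration.

```python
import ipaddress
from typing import Dict, List


def is_loopback_only(bind_host: str) -> bool:
    """Return True if bind_host is reachable only from the local machine."""
    if bind_host == "localhost":
        return True
    try:
        return ipaddress.ip_address(bind_host).is_loopback
    except ValueError:
        # Unparseable values (such as other hostnames) are treated as
        # potentially exposed, which is the cautious default.
        return False


def audit_bindings(bindings: Dict[str, str]) -> List[str]:
    """Return the names of services whose bind address is not loopback-only."""
    return sorted(
        name for name, host in bindings.items() if not is_loopback_only(host)
    )
```

For example, audit_bindings({"gateway": "0.0.0.0", "ui": "127.0.0.1"}) flags only "gateway", because 0.0.0.0 accepts connections on every interface. A check like this does not make a deployment safe, but it catches the most common way a local experiment ends up reachable from outside.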
AI Engineering in 2026 is no longer just about prompts — it’s about building AI Agents, RAG pipelines, and production-ready LLM systems.
This course is designed to be hands-on. Instead of just explaining AI concepts, we’re going to install tools, run models locally, and experiment with the systems that power modern AI engineering.
In this course, you’ll move from using tools like ChatGPT to engineering real AI architectures: agents, RAG pipelines, structured outputs, and hybrid routing across local models and cloud APIs.
You’ll start by running your own local LLM and validating exactly how it communicates. From there, you’ll build a simple AI assistant and then progressively evolve it into a structured, observable system.
You’ll learn how to:
Build AI Agents with controlled execution loops
Implement reliable RAG (Retrieval-Augmented Generation) pipelines
Enforce predictable, machine-readable outputs using structured schemas
Separate interface, engine, routing, and memory into clear architectural layers
Design hybrid AI systems that combine local and cloud models
Evaluate AI frameworks based on system design rather than hype
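Two of these ideas, a controlled execution loop and schema-checked outputs, fit in one short sketch. It is a simplified illustration under stated assumptions, not a framework's API: the model is any callable that takes the history and returns a JSON string, the action schema ({"tool", "input"} or {"final"}) is invented for this example, and the single "upper" tool is a placeholder.

```python
import json
from typing import Callable, Dict, List

# Hypothetical tool registry: the agent may call only what is listed here.
ALLOWED_TOOLS: Dict[str, Callable[[str], str]] = {
    "upper": lambda text: text.upper(),
}


def parse_action(raw: str) -> dict:
    """Validate a model reply against the schema, raising ValueError on mismatch."""
    action = json.loads(raw)  # invalid JSON raises a ValueError subclass
    if not isinstance(action, dict):
        raise ValueError("action must be a JSON object")
    if set(action) == {"final"} and isinstance(action["final"], str):
        return action
    if set(action) == {"tool", "input"} and all(
        isinstance(action[key], str) for key in action
    ):
        if action["tool"] not in ALLOWED_TOOLS:
            raise ValueError(f"tool not allowed: {action['tool']}")
        return action
    raise ValueError("action does not match the expected schema")


def run_agent(
    model: Callable[[List[str]], str], task: str, max_steps: int = 5
) -> str:
    """Loop until the model gives a final answer or the step budget runs out."""
    history = [task]
    for _ in range(max_steps):
        action = parse_action(model(history))
        if "final" in action:
            return action["final"]
        # Run the requested tool and feed its result back to the model.
        history.append(ALLOWED_TOOLS[action["tool"]](action["input"]))
    raise RuntimeError(f"no final answer within {max_steps} steps")
```

The step budget and the tool allowlist are what make the loop controlled: a model that never produces a final answer, or that asks for a tool outside the registry, causes an explicit error instead of open-ended execution.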
This is a practical AI engineering course for developers who want to understand what happens between “prompt” and “production” in real-world systems.
If you’ve experimented with ChatGPT or LLM APIs and want to move toward building scalable, production-ready AI systems with confidence and clarity, this course is for you.
By the end, you won’t just be using AI tools — you’ll be designing reliable, observable, production-ready AI systems you actually understand.