Build With AI: Python Frameworks and Tools

Name: Build With AI: Python Frameworks and Tools
Rating: 5.0 (2 reviews)

Learn To Create AI Apps in Python, Leveraging APIs, Tools, and Frameworks To Build AI Chat UIs, Vision, and Voice Agents

Created by Amos Gyamfi

Last updated 4/2026

English

What you'll learn

Start building apps with AI
Make apps using closed and open-source AI tools for Python
Learn the fundamentals of Python and AI
Discover and use the best Voice AI platforms and frameworks
Create Python-based AI apps with Deepgram and ElevenLabs
Integrate agents with leading AI services and providers
Learn AI fundamental concepts with OpenAI, Anthropic, Mistral, Gemini, Kimi AI, Qwen, and xAI APIs
Learn AI and agentic voice frameworks for Python, such as Cartesia, Speechmatics, Inworld AI, and Decart AI
Build speech-to-text and text-to-speech apps in Python
Add realtime speech and vision AI capabilities to apps
Create vision, voice, and video agents with Gemini
Build AI apps with Gemini Live API
Build AI apps using OpenAI Realtime API
Get started with speech-to-speech model APIs like Amazon Nova Sonic
Discover tools and best platforms for AI agents in Python
Build interfaces for chat, voice, video, and vision agents
Build AI chatbots and assistants in Python
Get started with open-source models and APIs such as Qwen, Kimi AI, DeepSeek, and GPT OSS
Learn to use low-code Python AI frameworks and tools
Learn AI-assisted coding tools and LLMs
Create agents for physical AI use cases, such as restaurant drive-thru
Create LLM-based apps in Python
Add realistic and natural-sounding AI voices to your app

Course content

9 sections • 79 lectures • 7h 28m total length

Introduction • 0:48
Explore Python-based AI frameworks and tools to build great AI applications, leveraging APIs from OpenAI, Mistral, Anthropic, and xAI, and multi-agent AI frameworks with UI design using Streamlit and Gradio.
Build a realtime video understanding agent with Kimi K2.6 • 3:26
Adding single-line and multi-line comments in Python • 2:12
Learn how to add single-line and multi-line comments in Python using the hash symbol and triple quotes, explaining code and preventing execution when testing.
Build realtime voice and video apps in Python • 4:06
Learn to build real-time voice and video apps in Python using the FastRTC library, WebRTC or WebSocket, and Gradio-powered UIs for low-latency multimodal conversational experiences.
Easiest way to build an MCP server in Python • 4:36
Discover how to build an MCP server in a few lines of Python using Gradio, turning any function into an MCP server for MCP client apps like Cursor or Windsurf.
Get started with Gemini API File Search • 3:59
Get started with the Gemini File Search API in Python by uploading documents, converting them to embeddings, and querying with Gemini 2.5 Pro or 2.5 Flash to retrieve precise answers.
Get started with Grok Text-to-Speech API: Build voice agents for any use case • 12:37
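
As a small illustration of the commenting lecture in this section, the sketch below shows a hash comment and a triple-quoted string used to keep a line from running while testing. The function name `greet` is made up for this example and does not come from the course.

```python
# A single-line comment starts with the hash symbol and runs to the end of the line.

def greet(name):
    """Return a greeting (this triple-quoted string is the function's docstring)."""
    return f"Hello, {name}!"  # a comment can also follow code on the same line

"""
A bare triple-quoted string is evaluated and then discarded, so it is often
used as a multi-line comment or to switch off code while testing:
print(greet("nobody"))
"""

message = greet("Python")
print(message)
```

Only the last two lines produce output; the `print` call inside the triple-quoted string never runs.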

Build your first voice and video agent in Python • 2:58
Create a real-time voice and video AI agent in Python with the Vision Agents framework, configuring uv and env credentials, and using GetStream with Gemini Live or OpenAI backends.
Build a voice agent with Gemma 4 as LLM • 2:38
Build a voice agent that can see your surroundings • 8:16
Build a voice agent that can see your surroundings by streaming real-time audio and video through Vision Agents, OpenAI, and GetStream, with Python project setup and an interactive web UI.
Using Cartesia Sonic 3 for Text-to-Speech in Voice AI • 4:28
Explore how Cartesia Sonic 3 enables low-latency text-to-speech in Python voice AI projects, integrated via Vision Agents, with customizable voice models and multilingual support.
Build an app to make AI phone calls with gpt-realtime-1.5 • 8:13
Create an expressive text-to-speech for voice AI using Fish Audio • 4:04
Create a realistic voice AI app in Python by assembling a custom voice pipeline with speech-to-text, turn detection, and text-to-speech using Vision Agents, Deepgram, Smart Turn, and Fish Audio.
How to build a GitHub MCP Voice Agent • 3:54
Build a voice-controlled MCP AI agent in Python using Vision Agents and the OpenAI Realtime API to perform real-time function calling and interact with GitHub repositories, including issues and pull requests.
Using Ultralytics YOLO for object detection in voice/vision AI • 2:36
Build a Yoga AI assistant in Python • 5:10
Build and deploy a real-time yoga AI instructor in Python by integrating Vision Agents, Gemini Live API, and Ultralytics YOLO for pose detection.
Build an exceptional speech-to-text AI experience with ElevenLabs Scribe v2 • 3:11
Learn to integrate ElevenLabs Scribe v2 real-time speech-to-text with Vision Agents for real-time transcription and note taking in AI meetings.
How to create a custom AI pipeline for voice agents in Python • 8:00
Design and implement a custom modular AI pipeline for vision and voice agents in Python, integrating Moondream for real-time object detection with OpenRouter, ElevenLabs, and Deepgram.
Build a voice agent using Kimi K2 Thinking as an LLM • 3:45
Build a Python-based voice and vision agent that uses Kimi K2 Thinking via OpenRouter to detect objects in real time and draw green bounding boxes.
Build a voice/vision app using Gemini 3 and Vision Agents • 3:43
Build a voice and vision app with Gemini 3 as the LLM in Vision Agents, enabling camera-feed descriptions and answers to questions. Configure thinking level and media resolution for tasks.
Build Maths and Physics voice AI tutor with DeepSeek v3.2 • 4:59
Create a maths and physics voice AI tutor in Python using DeepSeek v3.2 models on OpenRouter, via Vision Agents with speech-to-text and text-to-speech, swapping LLMs with a plugin.
Electronics setup AI agent in Python • 5:28
Build an electronics setup and repair voice assistant in Python using Vision Agents and Baseten, and follow a camera demo that guides battery, memory card, lens, and power-on testing.
Build a drive-thru restaurant ordering system in Python • 10:28
Build a drive-thru AI ordering system in Python by integrating voice and vision AI with Gemini Live speech-to-speech and Vision Agents for real-time, low-latency orders.
Build a Vision agent with Kimi K2.5 • 3:59
Learn to build a vision agent with Kimi K2.5 in Python using Vision Agents, integrating speech-to-text, text-to-speech, and OpenAI chat completions for real-time video and voice AI.
Create a speech-to-text app with Mistral Voxtral Transcribe 2 • 3:29
Build a real-time speech-to-text app using Voxtral Transcribe 2 with Vision Agents, Mistral models, and Deepgram text-to-speech to enable low-latency voice and video transcription.
Build an app with Amazon Nova Sonic speech-to-speech • 2:58
Learn to build a real-time voice agent in Python using Amazon Nova Sonic speech-to-speech, set up Vision Agents, GetStream, and AWS credentials to run a storytelling demo.
Use Hume AI's TADA TTS for realistic speech synthesis • 4:16
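
Several lectures in this section assemble a voice pipeline from a speech-to-text stage, an LLM, and a text-to-speech stage, and swap providers between them. The sketch below shows that shape with plain Python callables. It is not the Vision Agents API: `VoicePipeline`, `fake_stt`, `fake_llm`, and `fake_tts` are hypothetical stand-ins, and a real app would call a provider SDK inside each stage.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains three swappable stages: speech-to-text, LLM, text-to-speech."""
    stt: Callable[[bytes], str]
    llm: Callable[[str], str]
    tts: Callable[[str], bytes]

    def run(self, audio_in: bytes) -> bytes:
        text = self.stt(audio_in)   # audio -> transcript
        reply = self.llm(text)      # transcript -> response text
        return self.tts(reply)      # response text -> audio


# Hypothetical stand-ins so the sketch runs without any provider account.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")    # pretend the audio bytes are already text


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")     # pretend the text is synthesized audio


pipeline = VoicePipeline(stt=fake_stt, llm=fake_llm, tts=fake_tts)
audio_out = pipeline.run(b"hello")
print(audio_out)
```

Because each stage is just a callable, replacing one provider with another means assigning a different function to `stt`, `llm`, or `tts` and leaving the rest of the pipeline alone.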

Build your first AI agent using the OpenAI Agents SDK • 6:15
Build production-ready AI agents in Python with the OpenAI Agents SDK, using tools and orchestration. Set up the environment and create your first agent with the Agent and Runner APIs.
Local DeepSeek R1 agent using Ollama, Streamlit, and OpenAI Agents SDK • 8:51
Learn to build and run a local DeepSeek R1 agent with the OpenAI Agents SDK, using Ollama and a Streamlit interface on localhost, with performance traces via the OpenAI dashboard.
Build UIs for your AI agents • 10:15
Build UIs for OpenAI agents in Python using the OpenAI Agents SDK to orchestrate multi-agent workflows with LLMs, tools, and guardrails. Explore Generate, Gradio, and Streamlit for interactive interfaces.
How to build a voice AI agent • 4:50
Learn to build a voice AI agent in Python using the OpenAI Agents SDK, converting a defined workflow into a voice app with speech-to-text and text-to-speech.
How to create a file system MCP agent in Python • 4:35
Build a file system MCP agent in Python using GPT-4.1 and the OpenAI Agents SDK to chat with files in a directory through a Streamlit interface.
How to create your first agent with Agent Development Kit • 6:37
Launch your first agent with Google's Agent Development Kit for Python by setting up a virtual environment, installing the SDK, and building an agent with weather and time tools.

Build a simple AI chat UI with Kiro AI IDE • 8:49
Explore building an AI chat UI using the Kiro AI IDE, a VS Code-like environment, to code with the Kimi K2 model from Moonshot and a Streamlit interface for Python apps.
Build an AI chat UI for DeepSeek R1 and Ollama integration • 6:24
Build a local AI chat UI in Python with Streamlit and Ollama to run the DeepSeek R1 model offline, featuring a simple input and typewriter-style streaming output.

AI music generation using Lyria 3 and Gemini API • 17:14
Get started with file search in the Gemini API • 3:59
Learn to use the Gemini File Search API in Python, which provides built-in retrieval-augmented generation to upload documents, create embeddings, and query with a vector store.
Build voice AI apps with the new OpenAI voice models • 5:20
Learn to build voice AI apps with OpenAI's text-to-speech model using the response API, test multiple voices, and create a Streamlit UI to switch models and run Python examples.
Build a voice app with the OpenAI Agents SDK • 3:28
Build a real-time weather-aware voice assistant in Python using the OpenAI Agents SDK, wiring a voice pipeline (speech-to-text, LLM, text-to-speech) with a weather tool and Finnish language support.
OpenAI API: Build an AI agent with GPT 4.5 • 4:49
Explore building an AI agent using GPT-4.5 via the OpenAI API, with the Agno Python framework, DuckDuckGo tooling, and a setup that fetches the latest web information.
AI Image Generation with DALL-E • 7:37
Learn to generate images from scratch and create variations using OpenAI's DALL-E 3 and DALL-E 2 in a Python app, covering prompts, image sizes, quality, and styles.
How to turn text into realistic spoken audio • 6:04
Learn how to turn text into lifelike spoken audio using OpenAI's text-to-speech API in Python, including multilingual prompts and real-time streaming options.
OpenAI Swarm: A Quick Start Guide • 7:10
Build AI agents with OpenAI Swarm, an experimental framework that uses LLM-powered assistants and tools to perform tasks. Install, configure two agents (A and B), and run.
Get started with Structured Outputs • 6:17
Explore structuring LLM outputs with the OpenAI API using JSON schemas, including function calls and response formats, and define a Pydantic calendar event to constrain fields.
Moderate content with structured outputs • 5:41
Explore using structured outputs and a JSON schema to moderate text with OpenAI's API, defining a Pydantic-based category taxonomy (violence, sexual, self-harm) and testing phrases.
Web search in Anthropic API • 2:34
Explore how the Anthropic API web search tool delivers real-time web content via Claude models such as Claude 3.7 and 3.5 Sonnet, and how to call it with Python.
Getting Started With xAI: Make Your First API Call • 4:44
Get started with the xAI API by creating an account, generating an API key, and making your first curl call with the grok-beta model. Export the key and monitor usage.
How to generate images with Grok • 2:31
Access image generation on X with Grok's multimodal interface, paste and modify prompts, and compare Grok outputs with FLUX.1 [dev] on Hugging Face, Ideogram 2, and DALL-E 3.
Live search with Grok 4 • 9:47
Learn everything about live search using Grok 4.
Run DeepSeek R1 offline with Ollama • 3:48
Learn to run the DeepSeek R1 reasoning model offline with Ollama, selecting from distilled versions, and test a physics prompt that derives six meters per second.
Use Kimi K2 via OpenAI API • 7:21
Explore Kimi K2 via its website, playground, and application programming interface to test a mixture-of-experts model with 32 billion activated parameters and 1 trillion parameters, excelling in math and coding.
How to create a DeepSeek R1 agent in Python • 5:08
Build a high-speed Python AI agent with Groq and Agno, selecting the R1 Distill Llama 70B model, installing dependencies, and running a local playground interface.
Gemini CLI quick start • 2:33
Gemini CLI brings a free, open-source coding agent to your terminal, letting you install it globally, log in with Google, and query repo updates or generate SwiftUI code with PencilKit.
Build your first AI App with Gemini 2.5 Pro • 11:39
Learn to set up Gemini 2.5 Pro via the Google API, load your API key with Python, and build apps using a SwiftUI animation and a Streamlit UI.
How to make an API call with Gemini 2.0 Flash • 2:31
Get started with the Google Gemini API and make your first API call using Gemini 2.0 Flash by installing the SDK, acquiring an API key, and running a Python script.
Generate and edit images with Gemini 2.5 Flash Image Preview • 5:50
Learn to generate and edit photorealistic images with Gemini 2.5 Flash Image Preview via the API and Python, including blending photos and Photoshop-like edits.
How to run DeepSeek R1 locally/offline using LMStudio • 4:29
Learn to run the DeepSeek R1 thinking model offline with LM Studio, download models from Hugging Face, and watch it solve a physics problem with detailed intermediate reasoning steps.
Monitor an Agent's operation using Prometheus • 2:58
Monitor a vision agent's operation with Prometheus and OpenTelemetry, capturing LLM, speech-to-text, and text-to-speech metrics. Set up a Prometheus server and visualize metrics locally on port 9464.
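
The structured-output lectures in this section constrain a model's reply to a fixed set of fields, using a Pydantic calendar event as the schema. The sketch below illustrates the same idea with only the standard library, so it is a stand-in rather than the course's code: a dataclass replaces the Pydantic model, the field names are assumed for illustration, and the JSON string is a hand-written placeholder for a model response rather than real API output.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class CalendarEvent:
    # Field names are illustrative assumptions, not taken from the course.
    name: str
    date: str
    participants: list


def parse_event(raw_json: str) -> CalendarEvent:
    """Parse a JSON reply and reject missing or unexpected fields."""
    data = json.loads(raw_json)
    expected = {f.name for f in fields(CalendarEvent)}
    if set(data) != expected:
        raise ValueError(f"expected fields {sorted(expected)}, got {sorted(data)}")
    return CalendarEvent(**data)


# Placeholder for the text a model might return when asked for this schema.
reply = '{"name": "Science fair", "date": "Friday", "participants": ["Alice", "Bob"]}'
event = parse_event(reply)
print(event)
```

A reply with a missing or extra key raises `ValueError` instead of silently producing a partial event, which is the property that makes structured replies safe to pass on to other code.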

Using Agent Skills in Antigravity • 3:06
Add agent skills in Antigravity using the skills folder with SKILL.md and optional resources. Create and access workspace and global skills, and run commands to manage their locations.
Build your first AI agent in Python • 5:10
Learn to build your first AI agent in Python with the Phidata framework in three lines of code, using a system prompt and print response to run and test.
How to capture an AI agent's response into a variable • 1:43
Learn how to capture an AI agent's response in a variable using run response and streaming, so you can pass it to the front end or another agent.
Build Multi-Agent AI Apps With Phidata • 11:53
Build AI agents in Python using Phidata. Create web and finance agents and assemble multi-agent systems with open and closed LLMs and custom tools.
How to build an agentic RAG system with Python and Phidata • 5:29
Create a retriever AI agent and agentic RAG in Python with Phidata, wiring LanceDB, Tantivy, a PDF library, and SQLAlchemy to pull and analyze data from PDFs and other sources.
Build an AI agent for vector search and chat with your PDF • 10:13
Build an AI agent for vector search and chat with your PDF documents using Phidata, the OpenAI API, and LanceDB to create a searchable knowledge base.
Build your first computer-use agent in Python • 6:37
Build a Python computer-use agent with the browser-use library to automate browser actions. Guide the agent to search flights and find the cheapest option using GPT-4 and an OpenAI key.
Install and build your first AI agent with CrewAI • 7:27
Install and start your first multi-agent AI project with CrewAI, select providers, set up API keys, and run the research and data analyst agents.
Build a computer-using AI agent with Trae.ai and Browser Use • 7:50
Demonstrates building an AI agent with Browser Use to automate web tasks in Python, including setting up a virtual environment and finding the cheapest flight from Helsinki to San Francisco.

Write, edit and run Python code with ChatGPT Canvas • 1:35
Learn to edit and run Python code using ChatGPT Canvas in the web version of ChatGPT. Discover how to access Canvas via view tools, collaborate on writing, and add comments.
Using Cursor AI Code Editor for Swift and SwiftUI projects • 4:09
Generate 86 iOS buttons from an image using the Cursor AI code editor for SwiftUI projects, then apply button styles and shapes in Swift and share the boilerplate on GitHub.
Get started with AI-assisted coding in Xcode and Swift • 4:18
Learn AI-assisted coding in Swift with Alex for an integrated Xcode workflow. Generate Swift code from images, chat with your codebase, and add files seamlessly.
Generative SwiftUI Animation With Cursor AI Code Editor • 13:23
Learn to create and animate progress rings in SwiftUI using the Cursor AI code editor and Claude 3.5 Sonnet, crafting prompts to generate and refine SwiftUI views.
Build an iOS 18 Calculator With Cursor and SwiftUI • 5:47
Build a fully functional iOS 18 calculator using Cursor, Claude 3.5 Sonnet, and SwiftUI, with division, multiplication, subtraction, addition, and equals, generated from a screenshot and refined in code.
How to install and run the Moshi AI speech model • 5:02
Moshi is the first real-time multi-stream speech language model that you can interrupt during conversation. Install it locally with pip, download the model from Hugging Face, run the server, and test via the web UI.
How to run GGUF models locally with Gradio • 4:34
Learn to run GGUF models locally with a Gradio interface by downloading a model from Lama Comm or Hugging Face and using the llama.cpp app to stream responses.
Introduction to image generation and video segmentation models • 7:02
Explore image generation and video segmentation with SAM 2, demonstrating object tracking, background removal, and demos from Hugging Face Spaces and API workflows.

Create your first FastHTML app • 6:22
Create your first FastHTML app in Python with VS Code, building a simple form and a hello world page using Python objects that render HTML.
Python FastHTML: How to add a video player and CSS styling • 7:00
Learn to add a standard HTML video player to a FastHTML app with pure Python, and style the page with inline CSS and a CSS style component.
Working with SVG Graphics in FastHTML • 6:44
Learn to load SVG graphics into a FastHTML app from SVG Repo using SF Symbols, Material Symbols, and converted SVG code with FastHTML.

Requirements

This course is designed for anyone wanting to start AI app development in Python. It serves as your first step in creating AI apps and agents with Python, explaining all foundational concepts in more detail using plain English. It serves as an entry point to major AI APIs, including OpenAI, Anthropic, Gemini, xAI, Mistral, Kimi AI, Qwen, and prominent voice AI platforms such as Vision Agents, Deepgram, and ElevenLabs. I will use a Mac computer, Cursor, and API subscriptions from several model providers for most tutorials. However, students can follow along with a Windows or Linux computer and their preferred Python code editor, such as VS Code or Windsurf. I will mention the Python tool or framework required in every tutorial and provide instructions on how to obtain the necessary API credentials to complete the tutorial.

Description

AI is rapidly transforming how we approach everything, from learning to code to building apps to solving complex problems. Over the past three years, I have written AI-related articles, such as "The 6 Best LLM Tools To Run Models Locally," on Medium, and created AI content on YouTube. Many people, including students and developers, have asked me how to start building AI apps, specifically in Python.

In this course, I will guide you in learning fundamental AI concepts (Retrieval Augmented Generation, Fine-tuning, Embeddings, AI-native Vector Databases) to build AI apps, agents, and chat interfaces.
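
To make the embedding and retrieval ideas concrete, here is a minimal sketch of the retrieval step in Retrieval Augmented Generation, using only the standard library. It rests on a loud assumption: the "embedding" is a toy bag-of-words count, whereas a real app would get vectors from an embedding model and store them in a vector database. The names `embed`, `cosine`, and `retrieve` and the sample documents are invented for this example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (real apps use a model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


documents = [
    "streamlit builds chat interfaces in python",
    "ollama runs language models locally",
    "yolo detects objects in video frames",
]
# The "vector store": each document kept next to its vector.
index = [(doc, embed(doc)) for doc in documents]


def retrieve(query: str) -> str:
    """Return the stored document most similar to the query."""
    q = embed(query)
    return max(index, key=lambda item: cosine(q, item[1]))[0]


best = retrieve("how do I run models locally")
print(best)
```

In a full RAG app, the retrieved text would then be placed into the prompt sent to the LLM, so the model answers from the document rather than from memory alone.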

Join this course, and let's start building AI for video, vision, voice/speech, and more. You will discover and start creating AI agents using the best easy-to-use Python frameworks, such as OpenAI Realtime, OpenAI Agents SDK, and Vision Agents. You will also learn to use APIs from OpenAI, Anthropic, Mistral, Meta AI, Kimi AI, Qwen, DeepSeek, and xAI for agentic app creation, as well as for image, video, audio/voice, and text generation. By following all the tutorials in this course, you will understand the various concepts in AI and how to implement them in actual AI-related projects. In addition, you will be familiar with many Python-based libraries and web frameworks for creating AI apps.

Who this course is for:

Anyone who wants to start building apps with AI in Python
Anyone who wants to learn AI agent frameworks in Python
People looking to start building voice, video, and vision agents
Beginner Python learners looking to build AI apps and UIs for voice and chat
Developers who want to use efficient AI tools in building apps
Anyone who wants to learn AI concepts like LLM, RAG, Fine-Tuning, voice pipelines and more
Total beginners who want to start coding AI chat and audio interfaces using low-code Python frameworks
People who want to begin AI app development with Python
Anyone who wants to begin AI-assisted coding
Those starting their AI learning journey
Those starting their AI app development in Python
College students
Independent developers
Hobbyists interested in learning AI with Python

Course content

Get Started • 7 lectures • 32min

Build Vision, Voice, and Video AI Agents in Python • 20 lectures • 1hr 37min

Agents SDKs: Humanity's First School of AI Agents • 6 lectures • 41min

Build UIs for AI Chat • 2 lectures • 15min

Learn AI APIs: OpenAI, Anthropic, xAI, Gemini, DeepSeek, Mistral • 23 lectures • 2hr 14min

Build AI Agents • 9 lectures • 59min

AI-Assisted Code Editors • 8 lectures • 46min

Local LLM Tools • 1 lecture • 5min

FastHTML Foundations • 3 lectures • 20min
