
In this video, we discuss the prerequisites for the course, which focuses on building sophisticated GenAI agents with LangGraph. A strong foundation in the areas below is crucial for keeping up with the course content, as we won't be covering basic topics.
Python Proficiency:
Familiarity with concepts like:
Managing and storing environment variables using a .env file.
Package and environment management tools such as Poetry, Pipenv, or virtualenv.
Configuring the IDE with the interpreter.
Debugging and running files in the IDE.
Object-oriented programming.
Git version control.
Assumption of knowledge in these areas to maintain focus on LangGraph and advanced topics.
LangGraph and LangChain:
Overview:
LangGraph is an extension of the LangChain framework, tailored for building complex agent flows.
Use of LangChain is necessary to work with LangGraph.
Course Content:
Utilizes LangChain objects like prompt templates, chains, and possibly LangChain Expression Language.
Familiarity with these topics is beneficial.
Recommendation to check out a prior course covering these foundational topics if unfamiliar.
Ideal Students:
Proficiency in Python and LangChain is essential.
The course is challenging for those not comfortable with the mentioned topics.
Recommendation to reconsider taking the course if lacking proficiency in these areas.
To conclude, this course is designed for those with a solid understanding of Python and LangChain. If you meet these prerequisites, you'll be well-prepared to dive into the advanced concepts of LangGraph and build sophisticated agentic applications.
In this video, we introduce the topic of LangGraph, explaining its purpose and how it differs from LangChain. We highlight the advancements and flexibility of LangChain in building generative applications.
LangChain Overview:
Features:
Suitable for building DAG (directed acyclic graph) applications and agents.
Improved security, flexibility, readability, and usability.
Uses LangChain Expression Language for composability and convenient interaction with components.
Limitations:
Challenges in building complex agentic systems.
Autonomous agents have freedom but are not yet production-ready or highly usable.
Regular LLM calls are limited in complexity and control.
Router chains or agents can decide steps using LLMs but cannot create cycles.
LangGraph:
Introduction:
LangGraph addresses limitations by enabling the implementation of cycles in agents.
Provides an additional dimension of freedom and complexity.
Capabilities:
Allows defining flows with nodes and edges, including cycles.
Important for building complex agents with more freedom.
Integrates with flow engineering to define and control program flows.
LLMs can assist in deciding the flow direction (e.g., flow A, flow B, finishing, or restarting).
Implementation:
Advanced solutions are elegant and easy to implement with LangGraph.
Entire logic and flow can be expressed as a graph with cycles, enhancing convenience and capability.
Conclusion: We emphasize the convenience and advanced capabilities of LangGraph in developing sophisticated agentic applications. We encourage you to explore the course to see practical implementations of LangGraph.
Graphs:
Definition:
A graph is a mathematical object that represents relationships.
Consists of nodes (vertices) and edges that connect the nodes.
Applications:
Used in various fields, such as social networks, transportation maps, and cloud security.
Helps solve real-world problems through algorithms and property extraction.
Formal Definition:
A graph G consists of a set V of vertices and a set E of edges, where each edge is a pair (x, y) of vertices from V.
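As a minimal sketch of this definition (the vertex names and the is_valid_graph helper are illustrative assumptions, not part of the course), a graph can be stored as a set of vertices plus a set of vertex pairs:

```python
from typing import Hashable, Set, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def is_valid_graph(vertices: Set[Vertex], edges: Set[Edge]) -> bool:
    """Return True if every edge (x, y) connects two members of `vertices`."""
    return all(x in vertices and y in vertices for x, y in edges)


# Hypothetical example: three people in a small social network.
V = {"alice", "bob", "carol"}
E = {("alice", "bob"), ("bob", "carol")}
```

Calling is_valid_graph(V, E) checks the one rule the formal definition imposes: an edge may only mention vertices that belong to V.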
State Machines:
Definition:
A model of computation consisting of states and transitions between states.
Defines different states and rules for transitions to manage complex conditions and sequences in software systems.
Graph Representation:
State machines can be visualized as graphs, with states as nodes and transitions as edges.
This helps in understanding and managing the complexity of state machines.
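A state machine can be sketched as a transition table, which is the same thing as a labelled edge list. The traffic-light states and the step and run helpers below are hypothetical and only illustrate the idea:

```python
from typing import Dict, List, Tuple

# Transition table: (current state, event) -> next state.
# States are the graph's nodes; each entry is a labelled edge.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("green", "timer"): "yellow",
    ("yellow", "timer"): "red",
    ("red", "timer"): "green",
}


def step(state: str, event: str) -> str:
    """Follow the edge labelled `event` out of `state`; stay put if none exists."""
    return TRANSITIONS.get((state, event), state)


def run(state: str, events: List[str]) -> str:
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```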
LangGraph:
Overview:
A powerful library built on top of LangChain.
Allows describing flows using nodes and edges.
Capabilities:
Enables building sophisticated agentic applications.
Facilitates writing and running advanced agents in LangGraph.
Flow Engineering Overview:
Systematic and strategic approach for developing AI-driven software.
Manages and optimizes AI systems with LLMs by defining clear flows or sequences of operations.
Involves complex decision-making nodes where AI may generate multiple outputs, often refined iteratively.
Goals of Flow Engineering:
Guides AI through well-defined steps to improve output quality.
Incorporates systematic planning and testing phases mimicking human development processes.
Enhances reliability and functionality of AI-generated solutions.
Challenges with Autonomous Agents:
Projects like AutoGPT and BabyAGI struggle with long-term planning.
AI creating and executing tasks autonomously can lead to problems.
Developers need to define tasks and ensure AI stays within the task context.
Developer's Role:
Developers define the scope and plan for LLMs.
LLMs can make decisions about task readiness and subsequent steps within the defined flow.
Developers provide a blueprint for LLMs to follow, similar to a state machine where developers define the states and steps.
LangGraph and Flow Engineering:
LangGraph as an intermediate solution between fully autonomous agents and fully deterministic chains.
Allows building complex solutions by defining state machines and incorporating LLMs for specific tasks or decision-making.
Graph Components in LangGraph:
Nodes and edges, with the ability to include cycles.
Advanced logic can be built for complex AI systems.
Example: Creating a tweet, refining it iteratively using LLMs until achieving a high-quality post.
Future of AI Software Development:
Development time distribution:
60% on flow engineering and architecture of state machines.
35% on fine-tuning models for specific tasks.
5% on prompt engineering.
LangGraph Core Components:
Nodes:
Python functions that can contain any code, including LLM calls or agents.
Edges:
Connect nodes within the graph's execution.
Conditional Edges:
Help in making dynamic decisions within the graph's execution.
Special Nodes:
Start Node:
Entry point for the graph's execution.
End Node:
Exit point for the graph's execution.
Both nodes act as no-operations (no-op).
State or Agent State:
A dictionary storing important information for the graph.
Can store node execution results, temporary results, or chat history.
Available for all nodes within the graph.
Can be made persistent for robust and fault-tolerant software.
Node Functions:
Always receive the current state as input.
Return an updated state, ensuring the state evolves over time.
Advanced Concepts:
Cyclic Graphs:
Enable loops within the graph.
Human-in-the-Loop:
Allows for dynamic decision-making with human feedback.
Persistence:
Allows storing and retrieving graph states, enhancing robustness and user experience.
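The components above can be sketched without LangGraph at all. The following is a plain-Python stand-in, not LangGraph's API: the node names, the START and END strings, and the run_graph helper are assumptions made for illustration. It shows nodes that receive and return state, a conditional edge, and a cycle:

```python
from typing import Callable, Dict

State = Dict[str, object]
START, END = "__start__", "__end__"


def draft(state: State) -> State:
    """Node: produce a first draft (stand-in for an LLM call)."""
    return {**state, "text": "draft", "revisions": 0}


def revise(state: State) -> State:
    """Node: refine the draft and count the revision."""
    return {**state, "text": state["text"] + "+", "revisions": state["revisions"] + 1}


def should_continue(state: State) -> str:
    """Conditional edge: loop back to `revise` until two revisions are done."""
    return "revise" if state["revisions"] < 2 else END


NODES: Dict[str, Callable[[State], State]] = {"draft": draft, "revise": revise}
# Each entry maps a node to a router that names the next node.
EDGES: Dict[str, Callable[[State], str]] = {
    START: lambda state: "draft",
    "draft": lambda state: "revise",
    "revise": should_continue,
}


def run_graph(state: State) -> State:
    """Walk the graph from START until a router returns END."""
    current = EDGES[START](state)
    while current != END:
        state = NODES[current](state)
        current = EDGES[current](state)
    return state
```

The edge from revise back to itself is the cycle; the router decides on every pass whether to loop or finish, which is the role an LLM can play in a real flow.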
Lesson Summary Bullet Points
Objective: Build a "Reflexion Agent," an advanced agent architecture that improves its responses iteratively.
Core Idea: Extends the basic reflection concept by incorporating external tools, specifically a search engine (Tavily), to fetch real-time data and ground its responses.
Architecture Overview:
User Request: The process starts with a user query.
Responder Node: Generates an initial response, a critique of that response, and suggested search queries to improve it.
Execute Tools Node: Takes the suggested search queries and uses a search engine (Tavily) to retrieve relevant, real-time information.
Revisor Node: Takes the initial response, its critique, and the fetched search results. It revises the original response based on this combined information, incorporating external data and addressing the critique. It also generates new critiques and new search queries for the next iteration, along with citations for the information used.
Loop: The process (Execute Tools -> Revisor) repeats N times or until a stopping condition is met, refining the answer with each iteration.
End: The final revised response is returned to the user.
Key Enhancements:
Tool Use (Search): Uses Tavily search engine to ground answers in current, external information.
Advanced Prompting: Employs techniques to enable the agent to effectively critique its own output and use that critique alongside external data for improvement.
Citations: The agent generates citations for the information retrieved from the search engine.
Technology Stack:
LLM: OpenAI GPT-4 (needed for strong reasoning and instruction following).
Function Calling: Crucial for structuring the agent's outputs (response, critique, search queries, citations) and enabling tool use.
Search Engine: Tavily (optimized for LLM applications).
Tracing/Debugging: LangSmith (essential for understanding the complex flow).
Inspiration: Based on the "Reflexion" paper and a LangChain blog post/example, refactored for clarity.
Use Case Example: Generating a detailed article on "AI-Powered SOC," including listing relevant startups and their funding, requiring dynamic web searches and iterative refinement.
Lesson Summary Bullet Points
Objective: Set up the Python project structure and environment for building the Reflexion Agent.
Steps:
Create Project Directory: Make a new folder for the project (e.g., reflexion-agent).
Initialize Poetry: Navigate into the directory and run poetry init to create a pyproject.toml file and set up the project metadata.
Create Virtual Environment: Poetry automatically manages the virtual environment associated with the project.
Install Dependencies: Use poetry add to install the required packages:
python-dotenv: For loading environment variables from a .env file.
black, isort: For code formatting (good practice).
langchain: The core LangChain library.
langchain-openai: For interacting with OpenAI models.
langgraph: The library for building stateful, multi-actor applications.
Create .env File: Create an empty .env file in the project root (touch .env).
Open in IDE (PyCharm): Open the created project folder in PyCharm.
Configure Interpreter: PyCharm should automatically detect the Poetry environment. Verify or configure the project interpreter to use the Poetry virtual environment so the installed packages are recognized.
Populate .env File: Add the necessary API keys to the .env file:
OPENAI_API_KEY
TAVILY_API_KEY (for the search engine)
LANGCHAIN_API_KEY (for LangSmith)
Configure LangSmith Tracing: Add the following to the .env file:
LANGCHAIN_TRACING_V2=true
LANGCHAIN_PROJECT=reflexion-agent (or your desired project name for LangSmith UI)
Create Main File: Create a main.py file.
Basic Main Setup: In main.py, import load_dotenv (from dotenv import load_dotenv), call load_dotenv(), and add a standard if __name__ == '__main__': block.
Configure Run Configuration (PyCharm): Set up a run configuration in PyCharm to execute the main.py script easily.
Test Setup: Run main.py to ensure the environment is loaded correctly and the basic structure works.
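One optional way to make the test run more informative is to check that the expected keys are actually present. This is a sketch only: the missing_keys helper and the REQUIRED_KEYS tuple are assumptions, and in the real main.py load_dotenv() would be called before the check.

```python
import os
from typing import Iterable, List, Mapping

REQUIRED_KEYS = ("OPENAI_API_KEY", "TAVILY_API_KEY", "LANGCHAIN_API_KEY")


def missing_keys(env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required variable names that are absent or empty in `env`."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    # In the real main.py, load_dotenv() from python-dotenv would run first.
    missing = missing_keys(os.environ)
    print("Missing keys:", missing or "none")
```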
Lesson Summary Bullet Points
Objective: Implement the "Actor Agent" (specifically the first responder node) for the Reflexion architecture.
Role of Actor Agent: Takes the initial user query and generates the first response, which includes:
The main answer/content (e.g., a ~250-word essay).
A self-critique (identifying missing information and superfluous content).
Recommended search queries to gather information needed for improvement.
Key Technologies & Concepts:
LLM: OpenAI GPT-4 Turbo (required for reasoning, critique, and structured output generation).
Prompting:
Uses a ChatPromptTemplate with system instructions and placeholders.
System prompt defines the agent's role ("expert researcher") and the three required output components (response, critique, search queries).
Includes dynamic time placeholder (populated using a lambda function and .partial()).
Uses a first_instruction placeholder for the main task (e.g., "Provide a detailed ~250 word answer.").
Leverages MessagesPlaceholder to manage conversation history.
Structured Output (Pydantic):
Defines Pydantic BaseModel classes (Reflection, AnswerQuestion) in schemas.py to specify the desired output structure.
Reflection class has missing and superfluous fields for critique.
AnswerQuestion class includes answer (str), reflection (Reflection object), and search_queries (List[str]).
Uses Pydantic field descriptions to implicitly prompt/guide the LLM (e.g., describing what the reflection field should contain).
Function Calling:
Uses OpenAI's function calling capability to enforce the structured output.
Binds the AnswerQuestion Pydantic model as a tool to the LLM using llm.bind_tools().
Forces the LLM to use this specific tool/schema by setting tool_choice="AnswerQuestion".
Output Parsers:
Imports JsonOutputToolsParser and PydanticToolsParser from langchain_core.output_parsers.openai_tools.
Uses PydanticToolsParser to parse the LLM's function call output directly into the AnswerQuestion Pydantic object.
LangChain Expression Language (LCEL): Chains are constructed by piping components (prompt template -> LLM with bound tools -> output parser).
Debugging: Demonstrated running the chain and inspecting the output, including handling a ValidationError when the LLM initially failed to provide the required search_queries field.
Outcome: Created the first_responder_chain capable of generating an initial structured response, critique, and relevant search queries based on a user request. This forms the first node in the Reflexion agent graph.
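The shape of the structured output can be sketched without any dependencies. The course defines these classes as Pydantic models; the version below uses dataclasses as a stand-in, and parse_answer is a hypothetical helper that mimics what the parser does with tool-call arguments, including failing when search_queries is absent:

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Reflection:
    missing: str       # critique of what is missing
    superfluous: str   # critique of what is superfluous


@dataclass
class AnswerQuestion:
    answer: str
    reflection: Reflection
    search_queries: List[str]


def parse_answer(args: Dict[str, Any]) -> AnswerQuestion:
    """Build an AnswerQuestion from tool-call arguments; raise if a field is absent."""
    required = ("answer", "reflection", "search_queries")
    absent = [name for name in required if name not in args]
    if absent:
        raise ValueError(f"missing required fields: {absent}")
    return AnswerQuestion(
        answer=args["answer"],
        reflection=Reflection(**args["reflection"]),
        search_queries=list(args["search_queries"]),
    )
```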
Lesson Summary Bullet Points
Objective: Implement the "Revisor" node for the Reflexion Agent architecture.
Function: The Revisor node takes the previous response (including its critique and search terms), results from executed tools (web search results), and revises the response to improve it based on the critique and new information.
Implementation Steps:
Revision Instructions: Created a new set of instructions (revise_instructions) specifically for the revision task. These instruct the LLM to:
Use the previous critique and new information (search results) to revise the answer.
Add important information identified by the critique.
Include numerical citations for the new information to ensure verifiability.
Add a "References" section with URLs (not counting towards word limit).
Use the critique to remove superfluous/unnecessary information.
Adhere to a word limit (e.g., ~250 words).
Pydantic Schema (ReviseAnswer):
Created a new schema ReviseAnswer in schemas.py.
This class inherits from the AnswerQuestion schema (reusing answer, reflection, search_queries fields).
Added a new field: references: List[str] to hold the citation URLs.
Revision Chain (revisor):
Created a new chain in chains.py.
Used the original actor_prompt_template but partially filled the first_instruction placeholder with the new revise_instructions.
Piped this template to the LLM (GPT-4o).
Used llm.bind_tools to associate the ReviseAnswer schema/tool with the LLM call.
Explicitly set tool_choice="ReviseAnswer" to force the LLM to output data matching the ReviseAnswer schema, including the new references field.
Key Concepts:
Prompt Engineering: Tailoring instructions (revise_instructions) for a specific task (revision based on critique and new data).
Schema Inheritance: Reusing existing schema definitions (AnswerQuestion) to build upon them for new tasks (ReviseAnswer).
Function Calling (Forced): Using tool_choice ensures the LLM generates output in the desired structured format (ReviseAnswer Pydantic model), including required fields like citations.
Outcome: The revisor chain/node is now implemented, ready to be added to the LangGraph agent graph. It handles the iterative refinement step of the Reflexion process.
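Schema inheritance can be illustrated with a small dependency-free sketch. Dataclasses stand in for the Pydantic models used in the course, and the reflection field is simplified to a plain dict here:

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class AnswerQuestion:
    answer: str
    reflection: dict          # simplified; the course uses a nested Reflection model
    search_queries: List[str]


@dataclass
class ReviseAnswer(AnswerQuestion):
    references: List[str]     # citation URLs backing the revised answer


# The subclass exposes the inherited fields first, then its own.
field_names = [f.name for f in fields(ReviseAnswer)]
```

Because ReviseAnswer only adds references, any change to the base fields carries over to the revision schema automatically.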
Lesson Summary Bullet Points
Objective: Implement the Tool Executor node (using LangGraph's ToolNode) for the Reflexion agent.
Purpose: This node executes tools (like web searches) requested by the LLM based on the conversation state.
Core Tool: TavilySearch is used for performing web searches to gather real-time information.
Requires installing langchain-tavily.
Requires a Tavily API key stored in environment variables (.env).
Key LangGraph Component: ToolNode is a pre-built node that automatically:
Inspects the agent's state (specifically the messages list).
Looks for tool call requests in the last AI message.
Executes the specified tools with the provided arguments (can handle multiple calls, potentially in parallel).
Key LangChain Component: StructuredTool is used to wrap Python functions, providing them with a defined schema and description so LLMs and LangGraph can understand and use them correctly.
The StructuredTool.from_function() method is particularly useful here.
Implementation Strategy:
Created a Python function (run_queries) that takes a list of search queries and uses the TavilySearch tool's .batch() method to execute them.
Used StructuredTool.from_function() twice with the same run_queries function but gave each resulting tool a different name (AnswerQuestion and ReviseAnswer).
This naming distinction allows tracking why a search was performed (initial research vs. revision step) for better debugging and analysis, even though the underlying search mechanism is identical.
Final Node: Instantiated ToolNode by passing it the list containing the two differently named StructuredTool objects created above. This execute_tools node is now ready for the graph.
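The strategy of registering one function under two tool names can be sketched in plain Python. Nothing below is the LangChain or LangGraph API: fake_search, the TOOLS registry, and execute_tool_call are hypothetical stand-ins for TavilySearch, StructuredTool, and ToolNode respectively.

```python
from typing import Callable, Dict, List


def fake_search(query: str) -> str:
    """Hypothetical stand-in for a Tavily search call."""
    return f"results for: {query}"


def run_queries(search_queries: List[str]) -> List[str]:
    """Run every query; the course batches these through TavilySearch instead."""
    return [fake_search(query) for query in search_queries]


# The same function is registered under two tool names, so a trace shows
# whether a search came from the first draft or from a revision.
TOOLS: Dict[str, Callable[[List[str]], List[str]]] = {
    "AnswerQuestion": run_queries,
    "ReviseAnswer": run_queries,
}


def execute_tool_call(tool_call: dict) -> dict:
    """Dispatch one tool call of the form {"name": ..., "args": {...}}."""
    output = TOOLS[tool_call["name"]](tool_call["args"]["search_queries"])
    return {"tool": tool_call["name"], "output": output}
```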
Lesson Summary Bullet Points
Objective: Assemble the previously built components (Responder, Revisor, Tool Executor) into a complete LangGraph graph for the Reflexion Agent.
Imports:
Standard libraries (List from typing).
LangChain core messages (BaseMessage, ToolMessage).
LangGraph components (END, MessageGraph).
Previously defined chains (revisor, first_responder from chains.py).
Tool execution logic (execute_tools from tool_executor.py).
Graph Setup:
Used MessageGraph as the graph type, meaning the state passed between nodes is a list of messages.
Defined a constant MAX_ITERATIONS = 2 to limit the number of revision loops.
Instantiated a MessageGraph object called builder.
Adding Nodes:
Added the first_responder chain as the draft node.
Added the execute_tools object (the ToolNode instance built earlier) as the execute_tools node.
Added the revisor chain as the revise node.
Adding Edges:
Connected draft to execute_tools.
Connected execute_tools to revise.
Conditional Logic (event_loop):
Defined a function event_loop(state: List[BaseMessage]) -> str: to determine the next step after revision.
This function counts the number of ToolMessage instances in the state (representing completed tool execution/search cycles).
If the count (num_iterations) exceeds MAX_ITERATIONS, it returns the END constant to stop the graph.
Otherwise, it returns the string "execute_tools" to loop back for another search/revision cycle.
Adding Conditional Edges:
Used builder.add_conditional_edges starting from the revise node.
Passed the event_loop function as the condition logic.
Defined the mapping implicitly: the strings returned by event_loop ("execute_tools" or END) determine the next node or termination.
Setting Entry Point:
Defined the draft node as the graph's entry point using builder.set_entry_point("draft").
Compiling the Graph:
Compiled the defined structure into a runnable graph object: graph = builder.compile().
Visualizing the Graph:
Printed an ASCII representation: print(graph.get_graph().draw_ascii()).
Generated a Mermaid PNG image: graph.get_graph().draw_mermaid_png(output_file_path="graph.png").
Running the Graph:
Invoked the compiled graph with an initial query, passed as a list of messages because MessageGraph's state is a message list: res = graph.invoke([HumanMessage(content="Write about AI-Powered SOC...")]).
Extracted the final answer from the last message's tool call arguments.
Debugging: Showcased using LangSmith to trace the execution flow, inspect inputs/outputs at each step, and understand the iterative process.
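The routing function described above is small enough to sketch in isolation. The Message and ToolMessage classes and the END string below are stand-ins defined locally so the snippet runs without LangChain or LangGraph; only the counting logic mirrors the lesson:

```python
from dataclasses import dataclass
from typing import List

END = "__end__"        # stand-in for langgraph's END constant
MAX_ITERATIONS = 2


@dataclass
class Message:
    content: str


@dataclass
class ToolMessage(Message):
    """Stand-in for langchain_core.messages.ToolMessage."""


def event_loop(state: List[Message]) -> str:
    """Route to END once more than MAX_ITERATIONS tool results have accumulated."""
    num_iterations = sum(isinstance(message, ToolMessage) for message in state)
    if num_iterations > MAX_ITERATIONS:
        return END
    return "execute_tools"
```

Note that the comparison is strict, so the loop only stops once the number of tool results is greater than MAX_ITERATIONS, not when it equals it.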
Video Summary
Introduction: This video introduces the project for this section: building an advanced, complex Retrieval-Augmented Generation (RAG) workflow.
Tool: The project heavily utilizes LangGraph to implement this complex workflow.
Goal: To create a RAG system that yields significantly higher quality results compared to standard RAG implementations.
Inspiration: The project is inspired by the LangChain & Mistral Cookbook and research papers on Self-RAG, Corrective RAG, and Adaptive RAG.
Key Concepts: The advanced workflow incorporates:
Reflection: Evaluating the relevance/correctness of retrieved documents and the generated answer's grounding and accuracy.
Routing: Directing user queries to the most appropriate data source (e.g., vector store vs. web search).
Approach: Unlike the original cookbook (which focused on notebooks), this section refactors the code with a software engineering perspective, aiming for:
Production-readiness
Maintainability
Readability
Testability
Extensibility
Structure: The implementation will be built gradually from scratch, explaining the software engineering choices along the way.
Resources: All code is available in a public GitHub repository, organized into branches corresponding to each video lesson.
Corrective Retrieval-Augmented Generation, or CRAG, is a strategy for Retrieval-Augmented Generation (RAG) that incorporates self-reflection and self-grading on retrieved documents. This approach aims to enhance the relevance and accuracy of generated responses.
Flow of CRAG:
Retrieve Documents: The process starts by retrieving relevant documents from a dataset.
Evaluate Relevance: These documents are then evaluated for their relevance to the user question.
Fallback Mechanism: If any documents are found irrelevant, a web search is used as a fallback to find more pertinent information.
Dynamic Control Flow: Using LangGraph, we create a dynamic and adaptive workflow where each node in the graph modifies the state, and edges dictate the next steps based on relevance checks.
Inspiration and Refactoring:
This video series is inspired by the LangChain Mistral AI Cookbook Notebook. I took their example and refactored and refined it to make it more suitable for production use. The refactoring focuses on improving readability and maintainability, and on adding tests to ensure robust performance.
Reference:
LangChain Mistral AI Cookbook Notebook
https://github.com/mistralai/cookbook/blob/main/third_party/langchain/corrective_rag_mistral.ipynb
Video Summary
This video covers the initial boilerplate setup for the LangGraph project.
Directory Creation: A new project directory (langgraph-course) is created.
Environment Management: Poetry is used to initialize the project (poetry init) and manage the virtual environment.
Dependency Installation: Key dependencies are installed using poetry add, including:
beautifulsoup4 (for potential web loading)
langchain, langgraph, langchain-hub, langchain-community
langchain-tavily (for Tavily search API)
chromadb (vector store)
python-dotenv (for environment variables)
black, isort (code formatting)
pytest (testing framework)
IDE Configuration: PyCharm is opened, and the project interpreter is configured to use the Poetry environment.
Environment Variables: A .env file is created to store sensitive keys:
OPENAI_API_KEY
LANGCHAIN_API_KEY (for LangSmith)
TAVILY_API_KEY
Configuration for LangSmith tracing (LANGCHAIN_TRACING_V2, LANGCHAIN_PROJECT).
PYTHONPATH is set to the project root.
Basic Code: A main.py file is created with basic code to load environment variables and print a "Hello Advanced RAG" message for a sanity check.
Testing Emphasis: The importance of testing generative AI applications using tools like Pytest is highlighted.
Code Location: The code for this setup is available on the 1-start-here branch on GitHub.
Video Summary (Bullet Points)
Goal: Define a robust repository structure for the LangGraph project, prioritizing readability, maintenance, and testability.
Rationale: The structure should reflect the application's architecture (the LangGraph graph itself). This approach moves away from simple Jupyter notebooks towards production-ready code.
Inspiration: Based on refactoring the LangChain team's advanced RAG tutorial for better software engineering practices.
Proposed Structure:
Root Directory:
ingestion.py: Handles downloading data and indexing it into a vector store (e.g., ChromaDB).
main.py: (Implied) Entry point for the application.
.env, .gitignore, poetry.lock, pyproject.toml, README.md, LICENSE: Standard project files.
graph/ Package: Contains the core LangGraph logic.
graph.py: Defines the LangGraph graph, connecting nodes and edges.
state.py: Defines the GraphState object passed between nodes.
consts.py: Stores constant values (e.g., node names).
nodes/ Sub-package: Each .py file implements a specific node in the graph.
chains/ Sub-package: Each .py file implements a LangChain chain used by a corresponding node.
tests/ Sub-sub-package: Contains tests for the chains.
test_*.py (e.g., test_chains.py): Test files specifically for the chains, using Pytest conventions.
Testing Strategy:
Utilize Pytest for running tests.
Follow Pytest naming conventions (e.g., tests/ directory, test_*.py files, test_ function prefix).
Demonstrated setting up a Pytest run configuration in PyCharm.
Showed running a basic assert 1 == 1 test using both the terminal (pytest . -s -v) and the PyCharm runner.
Disclaimer: This is the instructor's preferred structure based on experience; it's not claimed to be the absolute best or only way. The focus is on creating a maintainable and extensible structure.
Code Access: The code for this stage is available on the 2-project-structure branch of the course's GitHub repository.
Next Steps: Implement the vector store ingestion logic in ingestion.py using ChromaDB.
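A placeholder test of the kind described above might look like the following sketch. The function name test_sanity is an assumption; any name with the test_ prefix is picked up by Pytest.

```python
# graph/chains/tests/test_chains.py


def test_sanity() -> None:
    """Placeholder test confirming that Pytest discovers and runs this file."""
    assert 1 == 1
```

Running pytest . -s -v from the project root should then report one passing test.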
Video Summary
Goal: Implement the ingestion pipeline to load, process, and store documents in a vector store for the RAG application.
Focus: This video covers the ingestion.py file.
Disclaimer: While advanced RAG techniques will be used later for retrieval, the ingestion part uses standard, default methods for simplicity, as the course focuses more on the retrieval/generation workflow.
Steps:
Imports: Import necessary libraries: load_dotenv, RecursiveCharacterTextSplitter, WebBaseLoader, Chroma, OpenAIEmbeddings.
Load Environment Variables: Use load_dotenv() to access API keys (especially OpenAI for embeddings).
Define Data Sources: Specify a list of URLs pointing to blog posts (about autonomous agents, prompt engineering, adversarial attacks) to be scraped.
Load Documents: Use WebBaseLoader in a list comprehension to iterate through the URLs and load the content of each page into LangChain Document objects.
Flatten List: Each WebBaseLoader's load() call returns a list of documents, so the comprehension yields a list of lists; flatten this into a single list of Document objects (docs_list).
Chunk Documents:
Initialize RecursiveCharacterTextSplitter using from_tiktoken_encoder.
Set chunk_size=250 and chunk_overlap=0.
Use the splitter's split_documents method on docs_list to create smaller document chunks (doc_splits).
Embed and Store:
Use Chroma.from_documents to:
Take the doc_splits as input.
Specify a collection_name ("rag-chroma").
Provide the OpenAIEmbeddings function (defaults to text-embedding-ada-002 unless a model is specified).
Set persist_directory="./.chroma" to save the vector store locally.
Create Retriever:
Instantiate a Chroma client again, pointing to the same collection_name and persist_directory, and providing the embedding_function.
Call .as_retriever() on the Chroma client object to get a LangChain Retriever object (used later for similarity searches).
Verification: Ran the script to ensure it created the .chroma directory with the persisted vector store.
Code Management: Commented out the vector store creation part (Chroma.from_documents) to prevent re-indexing on every run. The retriever part now loads the persisted store.
Code Location: The code for this video is available on the 3-ingestion branch of the course's GitHub repository.
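The flatten and chunk steps can be sketched with plain strings. This is a simplified illustration: chunk_words splits on whitespace as a stand-in for the token-based RecursiveCharacterTextSplitter used in the lesson, and the sample page texts are made up.

```python
from typing import List


def flatten(nested: List[List[str]]) -> List[str]:
    """Turn one list per URL into a single flat list of documents."""
    return [doc for docs in nested for doc in docs]


def chunk_words(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive chunks of at most `chunk_size` words, no overlap."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


# Hypothetical loader output: one inner list per URL.
docs_list = flatten([["page one text"], ["page two text", "page two appendix"]])
doc_splits = [chunk for doc in docs_list for chunk in chunk_words(doc, chunk_size=2)]
```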
Video Summary (Bullet Points)
Purpose: This video defines the GraphState for the LangGraph application.
GraphState: This is the central data structure that holds information passed between nodes during the graph's execution.
Implementation:
The GraphState class is created in the graph/state.py file.
It inherits from typing.TypedDict.
State Attributes: The GraphState will contain the following key pieces of information:
question (str): The original user query. Needed for reference in various nodes (e.g., checking document relevance, determining web search queries).
generation (str): The final answer generated by the LLM.
web_search (bool): A flag indicating whether a web search is needed to gather additional information.
documents (List[str]): A list containing the textual content of documents relevant to answering the question. These can be documents retrieved from a vector store or results from a web search.
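A minimal sketch of graph/state.py based on the attributes listed above. The initial_state value is a made-up example and is not part of the course code:

```python
from typing import List, TypedDict


class GraphState(TypedDict):
    """State shared by every node in the graph."""

    question: str         # original user query
    generation: str       # answer produced by the LLM
    web_search: bool      # True when a web search is still needed
    documents: List[str]  # content of documents relevant to the question


# Hypothetical initial state before any node has run.
initial_state: GraphState = {
    "question": "What is agent memory?",
    "generation": "",
    "web_search": False,
    "documents": [],
}
```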
Summary
Objective: Implement the grade_documents node in a LangGraph application.
Functionality: This node iterates through documents retrieved by a previous node and determines their relevance to the original user question.
Core Component: A retrieval_grader chain is created using LangChain.
Structured Output: The chain leverages the with_structured_output method and a Pydantic class (GradeDocuments) to force the LLM (defaulting to GPT-3.5 with function calling) to return a binary score ('yes' or 'no').
Filtering: Documents graded as 'yes' (relevant) are kept in a filtered_docs list; irrelevant documents are discarded.
Web Search Logic: If any document is found to be irrelevant ('no'), a web_search flag in the graph state is set to True, signaling a potential need for web search later in the graph.
Node Implementation: The grade_documents function takes the graph state, extracts the question and documents, iterates through documents calling the retrieval_grader chain, filters based on the score, updates the web_search flag if needed, and returns the updated state with filtered documents and the potentially updated flag.
Testing: The video demonstrates writing pytest tests for the retrieval_grader chain, covering both relevant ('yes') and irrelevant ('no') scenarios. It also discusses the challenges of testing LLM applications (non-deterministic outputs, reliance on third-party services, cost).
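The node's filtering logic can be sketched independently of the LLM. Here keyword_grader is a hypothetical stand-in for the retrieval_grader chain, and the state is a plain dictionary with string documents:

```python
from typing import Any, Callable, Dict, List


def keyword_grader(question: str, document: str) -> str:
    """Hypothetical stand-in for the LLM grader: 'yes' if any question word appears."""
    words = question.lower().split()
    return "yes" if any(word in document.lower() for word in words) else "no"


def grade_documents(
    state: Dict[str, Any], grader: Callable[[str, str], str] = keyword_grader
) -> Dict[str, Any]:
    """Keep relevant documents and flag a web search if any document is irrelevant."""
    question = state["question"]
    filtered_docs: List[str] = []
    web_search = False
    for document in state["documents"]:
        if grader(question, document) == "yes":
            filtered_docs.append(document)
        else:
            web_search = True
    return {"documents": filtered_docs, "question": question, "web_search": web_search}
```

Passing the grader as a parameter also makes the node easy to test with a deterministic function, which sidesteps the cost and variability of calling a real model in unit tests.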
Video Summary
Goal: Implement the websearch node for the LangGraph agent.
Tool: Leverages the Tavily Search API, which is optimized for LLM applications.
Prerequisite: Requires the TAVILY_API_KEY to be set in the .env file.
Implementation Steps:
Create a new file graph/nodes/web_search.py.
Import necessary modules: typing (Any, Dict), langchain.schema (Document), langchain_tavily (TavilySearch), graph.state (GraphState).
Initialize the TavilySearch tool, setting max_results=3.
Define the web_search function:
Takes state: GraphState as input.
Returns Dict[str, Any] to update the state.
Prints a debug message ("---WEB SEARCH---").
Extracts the question and existing documents from the input state.
Invokes the web_search_tool using the question.
Processes the Tavily results:
Iterates through the list of result dictionaries.
Extracts the content from each dictionary.
Joins the extracted content strings into a single string (joined_tavily_result) separated by newlines (\n).
Creates a LangChain Document object (web_results) using the joined_tavily_result as page_content.
Updates the documents list in the state:
If documents already exist in the state (meaning relevant documents were found earlier), append the web_results document.
If documents is None (meaning no relevant documents were found earlier), create a new list containing only the web_results document.
Returns a dictionary to update the graph state with the new documents and the original question.
Debugging: Demonstrated debugging the function, inspecting Tavily results, and verifying the joined string output.
File Renaming: Renamed the file from websearch.py to web_search.py (adding an underscore).
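As a rough sketch of the steps listed above, the node can be written in plain Python with stand-ins for the external pieces. Assumptions: `Document` here is a minimal dataclass rather than LangChain's class, and `fake_tavily_search` is a hypothetical replacement for the Tavily tool that returns a list of dictionaries with a `content` key, as the summary describes.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Document:
    """Minimal stand-in for LangChain's Document (only page_content is modeled)."""

    page_content: str


def fake_tavily_search(query: str) -> List[Dict[str, str]]:
    """Hypothetical search tool; the result shape is an assumption."""
    return [
        {"content": f"first result for {query}"},
        {"content": f"second result for {query}"},
    ]


def web_search(state: Dict[str, Any]) -> Dict[str, Any]:
    print("---WEB SEARCH---")
    question = state["question"]
    documents = state.get("documents")

    tavily_results = fake_tavily_search(question)
    # Join the content of every result into one newline-separated string.
    joined_tavily_result = "\n".join(result["content"] for result in tavily_results)
    web_results = Document(page_content=joined_tavily_result)

    if documents is not None:
        # Copy the list so the caller's state is not mutated by this sketch.
        documents = documents + [web_results]
    else:
        documents = [web_results]

    return {"documents": documents, "question": question}


fresh = web_search({"question": "agent memory", "documents": None})
extended = web_search({"question": "agent memory", "documents": [Document("kept doc")]})
```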
Video Summary
Goal: Implement the generate node, which is the final step in the main RAG flow before potential self-reflection/correction.
Functionality: Takes the relevant documents (context) and the original question, feeds them to an LLM via a predefined prompt, and generates the final answer.
Implementation Steps:
Chain Definition (graph/chains/generation.py):
Imports hub (from langchain), StrOutputParser, ChatOpenAI.
Initializes the ChatOpenAI LLM (setting temperature=0).
Pulls the standard rlm/rag-prompt from LangChain Hub.
Creates the generation_chain using LangChain Expression Language (LCEL): prompt | llm | StrOutputParser().
Testing (graph/chains/tests/test_chains.py):
Adds a test function test_generation_chain.
Imports pprint and the generation_chain.
Sets a sample question ("agent memory").
Retrieves relevant docs using the retriever.invoke(question).
Invokes the generation_chain with the retrieved docs as context and the question.
Uses pprint to print the generation (serves as a sanity check, not a formal assertion).
Runs all tests using the PyCharm Pytest runner to confirm they pass.
Briefly shows the trace in LangSmith, highlighting the retriever step and the final runnable sequence (prompt -> LLM -> parser).
Node Definition (graph/nodes/generate.py):
Imports typing (Any, Dict), generation_chain from graph.chains.generation, GraphState from graph.state.
Defines the generate function:
Takes state: GraphState as input.
Returns Dict[str, Any] to update the state.
Prints a debug message ("---GENERATE---").
Extracts question and documents from the input state.
Invokes the generation_chain with context=documents and question=question.
Returns a dictionary updating the graph state with the original documents, question, and the newly generated generation.
Key Components:
RAG Prompt: Uses a standard prompt template from LangChain Hub designed for question answering with context.
Generation Chain: A simple LCEL sequence combining the prompt, LLM, and output parser.
Generate Node: The function that orchestrates calling the generation chain with the correct inputs from the graph state and updates the state with the result.
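To make the chain and node relationship concrete, here is a small framework-free sketch. Everything in it is an assumption for illustration: `Step` is a hypothetical class that mimics pipe-style composition with `|`, and the prompt, LLM, and parser are trivial stand-ins rather than the Hub prompt, ChatOpenAI, or StrOutputParser.

```python
from typing import Any, Callable, Dict


class Step:
    """Tiny stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # Composing two steps yields a step that runs them in order.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Hypothetical components standing in for prompt | llm | StrOutputParser().
prompt = Step(lambda inputs: f"Context: {inputs['context']}\nQuestion: {inputs['question']}")
fake_llm = Step(lambda text: {"content": f"ANSWER based on -> {text}"})
parser = Step(lambda message: message["content"])

generation_chain = prompt | fake_llm | parser


def generate(state: Dict[str, Any]) -> Dict[str, Any]:
    print("---GENERATE---")
    question = state["question"]
    documents = state["documents"]
    generation = generation_chain.invoke({"context": documents, "question": question})
    return {"documents": documents, "question": question, "generation": generation}


output = generate({"question": "what is agent memory?", "documents": ["doc a"]})
print(output["generation"])
```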
Goal: Build the final LangGraph graph by connecting all the previously defined nodes and edges.
Steps:
Imports (graph/nodes/__init__.py & graph/graph.py):
In graph/nodes/__init__.py, import all node functions (generate, grade_documents, retrieve, web_search) and define __all__ to make them easily importable from the package.
In graph/graph.py, import load_dotenv, END and StateGraph from LangGraph, node constants (RETRIEVE, GRADE_DOCUMENTS, etc.) from graph.consts, all node functions from graph.nodes, and GraphState from graph.state.
Define Node Constants (graph/consts.py): Create constants (e.g., RETRIEVE = "retrieve") for node names to avoid typos and code duplication.
Define Conditional Edge Logic (graph/graph.py):
Create a function decide_to_generate(state) that checks the state["web_search"] flag.
If True (meaning a document was irrelevant), return the WEBSEARCH constant (node name).
If False (meaning all retrieved documents were relevant), return the GENERATE constant.
Build the Graph (graph/graph.py):
Instantiate workflow = StateGraph(GraphState).
Use workflow.add_node() to add each node (RETRIEVE, GRADE_DOCUMENTS, GENERATE, WEBSEARCH), mapping the node name constant to the corresponding function.
Use workflow.set_entry_point(RETRIEVE) to define the starting node.
Use workflow.add_edge(RETRIEVE, GRADE_DOCUMENTS) for the first linear connection.
Use workflow.add_conditional_edges():
Starting from GRADE_DOCUMENTS.
Using the decide_to_generate function to determine the next node.
Optionally providing a path_map (e.g., {"websearch": WEBSEARCH, "generate": GENERATE}) to map the function's return value to node names explicitly (explained but noted as optional here).
Use workflow.add_edge(WEBSEARCH, GENERATE) for the edge after web search.
Use workflow.add_edge(GENERATE, END) for the final edge.
Compile and Visualize:
Compile the graph: app = workflow.compile().
Generate a visualization: app.get_graph().draw_mermaid_png(output_file_path="graph.png").
Run the Graph (main.py):
Import the compiled app from graph.graph.
Invoke the graph: print(app.invoke({"question": "what is agent memory?"})).
Verification:
Ran main.py and observed the print statements confirming the execution flow (retrieve -> grade -> web search -> generate).
Showed the generated graph.png matching the intended structure.
Examined the trace in LangSmith, confirming the node execution order and the input/output for each step, including the Tavily call during web search.
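The wiring above can be illustrated without LangGraph by a small hand-rolled runner. This is only a sketch of the control flow: the node bodies are hypothetical stubs, `END` is a local sentinel, and `run_graph` is an invented helper that plays the role of the compiled graph.

```python
from typing import Any, Callable, Dict, List, Tuple

RETRIEVE = "retrieve"
GRADE_DOCUMENTS = "grade_documents"
WEBSEARCH = "websearch"
GENERATE = "generate"
END = "__end__"  # local sentinel for this sketch

State = Dict[str, Any]


# Hypothetical stub nodes; each returns a partial state update.
def retrieve(state: State) -> State:
    return {"documents": ["relevant doc", "irrelevant doc"]}


def grade_documents(state: State) -> State:
    kept = [d for d in state["documents"] if not d.startswith("irrelevant")]
    return {"documents": kept, "web_search": len(kept) < len(state["documents"])}


def web_search(state: State) -> State:
    return {"documents": state["documents"] + ["web result"]}


def generate(state: State) -> State:
    return {"generation": f"answer from {len(state['documents'])} documents"}


def decide_to_generate(state: State) -> str:
    # Conditional edge: go to web search if any document was irrelevant.
    return WEBSEARCH if state["web_search"] else GENERATE


NODES: Dict[str, Callable[[State], State]] = {
    RETRIEVE: retrieve,
    GRADE_DOCUMENTS: grade_documents,
    WEBSEARCH: web_search,
    GENERATE: generate,
}
EDGES = {RETRIEVE: GRADE_DOCUMENTS, WEBSEARCH: GENERATE, GENERATE: END}
CONDITIONAL_EDGES = {GRADE_DOCUMENTS: decide_to_generate}


def run_graph(initial_state: State) -> Tuple[State, List[str]]:
    state = dict(initial_state)
    current = RETRIEVE  # entry point
    visited = []
    while current != END:
        visited.append(current)
        state.update(NODES[current](state))
        router = CONDITIONAL_EDGES.get(current)
        current = router(state) if router else EDGES[current]
    return state, visited


final_state, visited = run_graph({"question": "what is agent memory?"})
print(" -> ".join(visited))
```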
Video Summary
Introduction: This video introduces the implementation of Self-RAG end-to-end.
Self-RAG Concept: Derived from the Self-RAG paper, it focuses on reflecting on the generated answer.
Reflection Steps:
Hallucination Check: Verify if the generated answer is grounded in the provided documents.
Answer Relevance Check: If grounded, verify if the answer actually addresses the original user question.
Conditional Logic:
If the answer is grounded and answers the question, return it to the user.
If the answer is grounded but does not answer the question, trigger a web search (assuming the vector store lacks the necessary information).
If the answer is not grounded (hallucinated), regenerate the answer, forcing it to be grounded in the documents.
Implementation Scope: The video will cover implementing the necessary chains, tests, nodes, and conditional edges to achieve this Self-RAG workflow.
Summary
Introduction: Introduces the Self RAG pattern to improve Retrieval-Augmented Generation by adding reflection steps after LLM generation.
Implementation Tool: Demonstrates the implementation using LangGraph.
Core Change: Modifies the standard RAG flow to include answer evaluation before finalizing the output.
Key Components - Graders:
Builds a hallucination_grader chain to check if the generated answer is factually grounded in the retrieved documents.
Builds an answer_grader chain to check if the generated answer is relevant to the original user question.
Both graders use LangChain's structured output with Pydantic models to return a simple boolean (True/False or yes/no) result.
Testing: Includes writing and running unit tests (using Pytest) for both grader chains to verify their logic.
Graph Logic - Conditional Edge:
Adds a new conditional edge function (grade_generation_grounded_in_documents_and_question) triggered after the generate node.
This function runs the hallucination_grader first. If the answer is grounded, it then runs the answer_grader.
Graph Logic - Routing (path_map):
Uses LangGraph's path_map feature in add_conditional_edges to map the string outputs ("useful", "not useful", "not supported") from the conditional function to specific graph nodes.
Routes:
useful (Grounded & Relevant) -> END (Return answer to user).
not useful (Grounded but Irrelevant) -> WEBSEARCH (Get better context).
not supported (Not Grounded/Hallucinated) -> GENERATE (Retry generation with existing context).
Visualization & Tracing: Runs the complete Self RAG graph, shows the console output demonstrating the flow, and mentions reviewing the detailed trace in LangSmith. The generated graph diagram visually represents the new conditional flow.
Outcome: Creates a more sophisticated and robust RAG system capable of self-correction and deciding appropriate next steps based on the quality of the generated answer.
Code Availability: Mentions that the code is available on GitHub.
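The routing described in this summary can be sketched as a plain function. The two graders below are hypothetical stand-ins (substring and word-overlap checks) for the LLM-backed hallucination_grader and answer_grader chains, and `PATH_MAP` uses illustrative node-name strings.

```python
from typing import Any, Callable, Dict, List


def fake_hallucination_grader(documents: List[str], generation: str) -> bool:
    """Hypothetical check: the answer is 'grounded' if it appears in a document."""
    return any(generation.lower() in doc.lower() for doc in documents)


def fake_answer_grader(question: str, generation: str) -> bool:
    """Hypothetical check: the answer is 'relevant' if it shares a word with the question."""
    return bool(set(question.lower().split()) & set(generation.lower().split()))


def grade_generation_grounded_in_documents_and_question(
    state: Dict[str, Any],
    hallucination_grader: Callable[[List[str], str], bool] = fake_hallucination_grader,
    answer_grader: Callable[[str, str], bool] = fake_answer_grader,
) -> str:
    # Step 1: hallucination check against the retrieved documents.
    if not hallucination_grader(state["documents"], state["generation"]):
        return "not supported"
    # Step 2: only grounded answers are checked for relevance to the question.
    if answer_grader(state["question"], state["generation"]):
        return "useful"
    return "not useful"


# Maps the function's string output to the next node (names are illustrative).
PATH_MAP = {"useful": "__end__", "not useful": "websearch", "not supported": "generate"}

docs = ["agents keep short-term memory in context"]
useful = grade_generation_grounded_in_documents_and_question(
    {"documents": docs, "question": "what memory do agents keep", "generation": "short-term memory"}
)
not_useful = grade_generation_grounded_in_documents_and_question(
    {"documents": docs, "question": "how to make pizza", "generation": "agents keep"}
)
not_supported = grade_generation_grounded_in_documents_and_question(
    {"documents": docs, "question": "how to make pizza", "generation": "pizza needs flour"}
)
```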
Summary
Introduction: Introduces Adaptive RAG, implemented using LangGraph, as the final part of the section.
Concept: Adaptive RAG dynamically routes a user's question to different processing flows (RAG pipelines) based on the question's nature. It's based on a research paper.
Core Component - Question Router:
Implements a question_router chain using LangChain.
This chain takes the user's question and decides whether the answer likely exists within the pre-indexed vectorstore or if a websearch is required.
Uses structured output (Pydantic RouteQuery class with Literal type) to enforce the output to be either "vectorstore" or "websearch".
The prompt guides the LLM: Use vector store for specific topics (agents, prompt engineering, adversarial attacks); use web search for everything else.
Testing: Includes writing and running unit tests (using Pytest) for the question_router chain, verifying it correctly routes questions to both "vectorstore" and "websearch" based on the topic.
Graph Logic - Conditional Entry Point:
Integrates the question_router into the main LangGraph workflow.
Introduces and uses set_conditional_entry_point instead of set_entry_point.
This function acts as the first step after starting the graph (__start__). It runs the route_question function (which invokes the question_router chain).
Graph Logic - Routing (path_map):
Based on the output of route_question ("vectorstore" or "websearch"), the path_map directs the graph to the appropriate next node (RETRIEVE or WEBSEARCH).
Overall Flow: The graph now starts, routes the question using the question_router, and then proceeds down either the vector store retrieval path or the web search path. The rest of the graph logic (grading, generation, reflection) remains the same as implemented previously.
Demonstration: Runs the final graph with two different questions ("what is agent memory?" and "how to make pizza?") to show the conditional entry point routing the execution correctly to either the RETRIEVE or WEBSEARCH node, respectively.
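A minimal sketch of the conditional entry point logic follows. It assumes a keyword-based `fake_question_router` in place of the LLM-backed question_router chain; the topic list mirrors the prompt guidance above, and `ENTRY_PATH_MAP` uses illustrative node names.

```python
from typing import Any, Dict, Literal

# Topics the summary says are covered by the vector store.
VECTORSTORE_TOPICS = ("agent", "prompt engineering", "adversarial attack")

Datasource = Literal["vectorstore", "websearch"]


def fake_question_router(question: str) -> Datasource:
    """Hypothetical stand-in for the structured-output router chain."""
    lowered = question.lower()
    if any(topic in lowered for topic in VECTORSTORE_TOPICS):
        return "vectorstore"
    return "websearch"


def route_question(state: Dict[str, Any]) -> Datasource:
    print("---ROUTE QUESTION---")
    return fake_question_router(state["question"])


# Maps the router's output to the first node to execute.
ENTRY_PATH_MAP = {"vectorstore": "retrieve", "websearch": "websearch"}

first_node_memory = ENTRY_PATH_MAP[route_question({"question": "what is agent memory?"})]
first_node_pizza = ENTRY_PATH_MAP[route_question({"question": "how to make pizza?"})]
```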
Summary:
Introduction to the project: Building a ReAct agent executor using LangGraph.
Learn why LangGraph is a powerful tool for creating complex agent flows like ReAct.
Understand the core ReAct (Reasoning and Acting) pattern.
Explore key LangGraph concepts:
Defining agent workflows as graphs.
Understanding and implementing custom Graph States.
Integrating external tools (Tavily Search, custom functions).
Building loops and conditional logic within the agent graph.
See a practical example: An agent that fetches weather and performs a calculation.
Goal: Create a fully functional ReAct agent implemented entirely with LangGraph.
Video Description Summary:
This video covers the initial project setup for building a ReAct LangGraph agent with function calling.
Project Initialization: Learn how to start a Python project using Poetry for dependency management.
Environment Setup: Set up a .gitignore file to keep sensitive information (like API keys in a .env file) out of version control.
Dependency Installation: Install all necessary Python packages using Poetry, including langchain, langchain-openai, langchain-tavily (for search), langgraph, python-dotenv, and formatting tools (black, isort).
API Key Configuration: Configure the .env file with required API keys for OpenAI, LangChain tracing (for observability), and Tavily search.
File Structure: Create the basic Python files (main.py, react.py, nodes.py) that will be used in subsequent videos to implement the LangGraph agent.
Code Availability: All code changes from this setup step are committed and available in the linked GitHub repository for reference.
After this video, your project environment will be ready to start building the LangGraph agent.
Video Description Summary:
This video focuses on implementing the reasoning logic for the ReAct LangGraph agent using Function Calling.
Import Necessary Libraries: Imports include load_dotenv (to load API keys), the @tool decorator (to define custom tools), ChatOpenAI (to use OpenAI models), and TavilySearch (a pre-built search tool).
Define Custom Tools: Implement a simple triple function and mark it as a tool using the @tool decorator.
Utilize Pre-built Tools: Incorporate the TavilySearch tool from LangChain.
Introduce Function Calling: Explain how modern LLMs (like OpenAI models) handle tool selection by understanding tool definitions provided during initialization and returning structured tool calls in their response.
Contrast with ReAct: Briefly compare function calling to the older ReAct prompting method, highlighting the advantages of letting the LLM vendor handle parsing and response formatting.
Bind Tools to the LLM: Demonstrate how to attach the defined tools to the ChatOpenAI instance using the bind_tools() method.
Code Check: Verify that the implemented code runs without syntax errors.
Commit Changes: The code changes are committed to the GitHub repository under the "function calling reasoning" commit.
Preview Next Step: The following video will implement the executable nodes for the LangGraph graph structure.
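The idea behind function calling can be sketched without any LLM: the model returns a structured tool call, and the application looks up and runs the matching function. In this illustration the tool-call dictionary shape (`name` plus `args`) and the `execute_tool_call` helper are assumptions made for the example; no `@tool` decorator or model is involved.

```python
from typing import Any, Callable, Dict


def triple(num: float) -> float:
    """Return the input number multiplied by three."""
    return float(num) * 3


# Registry of available tools, keyed by the name the model would refer to.
TOOLS: Dict[str, Callable[..., Any]] = {"triple": triple}


def execute_tool_call(tool_call: Dict[str, Any]) -> Any:
    """Run the tool named in a structured tool call with its arguments."""
    func = TOOLS[tool_call["name"]]
    return func(**tool_call["args"])


# A hand-written tool call standing in for what an LLM might return.
example_call = {"name": "triple", "args": {"num": 7}}
tool_result = execute_tool_call(example_call)
print(tool_result)
```

Because the call arrives as structured data, no text parsing of the model's reply is needed, which is the advantage over the older ReAct prompting approach noted above.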
Video Description Summary:
This video focuses on building the LangGraph structure in main.py.
Necessary components like MessagesState and StateGraph are imported from langgraph.graph, along with the previously defined reasoning and tool nodes from nodes.py.
Constants like AGENT_REASON, ACT, and LAST are defined for cleaner code.
A StateGraph is initialized using MessagesState to manage the conversation history.
The agent_reasoning and act nodes are added to the graph using flow.add_node().
The entry point for the graph is set to the agent_reasoning node using flow.set_entry_point().
A conditional edge is added from agent_reasoning to either the act node or the end node based on the logic in the should_continue function.
The should_continue function checks if the last message in the state contains tool calls to determine the next step ('ACT' or 'END').
A standard edge is added from the act node back to the agent_reasoning node.
The graph is compiled using flow.compile().
The graph structure is visualized by generating a Mermaid PNG file.
The next video will involve testing the completed graph and reviewing the execution traces.
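The reason/act loop and the should_continue check can be sketched in plain Python. Assumptions: `Message` is a minimal dataclass rather than a LangChain message, `END` is a local sentinel, and `run_agent` is a hypothetical driver that replays scripted model replies instead of calling an LLM.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

AGENT_REASON = "agent_reason"
ACT = "act"
END = "__end__"  # local sentinel for this sketch


@dataclass
class Message:
    """Minimal stand-in for a chat message that may carry tool calls."""

    content: str
    tool_calls: List[Dict[str, Any]] = field(default_factory=list)


def should_continue(state: Dict[str, List[Message]]) -> str:
    # If the last message requests tools, run them; otherwise finish.
    last_message = state["messages"][-1]
    return ACT if last_message.tool_calls else END


def run_agent(scripted_replies: List[Message]) -> List[str]:
    """Replay scripted model replies and record which nodes would run."""
    replies = iter(scripted_replies)
    state = {"messages": [Message("user question")]}
    trace = []
    while True:
        trace.append(AGENT_REASON)
        state["messages"].append(next(replies))  # reasoning node output
        if should_continue(state) == END:
            return trace
        trace.append(ACT)
        state["messages"].append(Message("tool result"))  # act node output


trace = run_agent(
    [
        Message("", tool_calls=[{"name": "search", "args": {"query": "weather"}}]),
        Message("", tool_calls=[{"name": "triple", "args": {"num": 20}}]),
        Message("final answer"),
    ]
)
print(" -> ".join(trace))
```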
Video Description Summary:
Invoking the Graph: The video demonstrates running the compiled LangGraph agent using app.invoke().
Providing Input: A sample query "What is the weather in Tokyo? List it and then triple it" is provided as a HumanMessage in the initial state.
Tracing Execution with LangSmith: LangSmith tracing is used to visualize the execution flow of the graph.
Observing Node Execution: The trace shows the sequence of nodes being executed: agent_reasoning -> act -> agent_reasoning -> act -> agent_reasoning -> end.
Analyzing Tool Usage: The trace details how the LLM (within the agent_reasoning node) outputs tool calls, and the tool_node executes these calls (Tavily Search and the custom triple tool).
Handling Ambiguity: The trace reveals that Tavily Search was called twice because the initial search result didn't contain the specific temperature needed, prompting the agent to search again with modified parameters.
Successful Tool Chaining: The agent successfully extracted the temperature from the search results and passed it as input to the triple tool, demonstrating effective function calling and tool chaining.
Final Output: The final output includes the weather information found by Tavily Search and the tripled value of the temperature calculated by the custom tool.
Code Availability: The code for the graph implementation is committed and available in the GitHub repository under the "graph" commit.
Implementing Parallel Execution
This video explains how to implement parallel execution in LangGraph, a Python library for graph-based workflows. Topics covered:
Parallel node fan-out and fan-in
Multi-step parallel processes
Conditional branching in parallel workflows
Stable sorting for consistent parallel execution results
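As a loose analogy for fan-out and fan-in with a deterministic merge, the sketch below runs two branch functions in parallel and combines their list updates in sorted branch-name order. This is plain Python with a thread pool, not LangGraph's implementation; `fan_out_fan_in` and the node names are hypothetical.

```python
import operator
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

State = Dict[str, List[str]]


def node_b(state: State) -> State:
    return {"aggregate": ["I'm B"]}


def node_c(state: State) -> State:
    return {"aggregate": ["I'm C"]}


def fan_out_fan_in(state: State, branches: Dict[str, Callable[[State], State]]) -> State:
    # Fan-out: every branch receives the same input state and runs concurrently.
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(func, state) for name, func in branches.items()}
        updates = {name: future.result() for name, future in futures.items()}

    # Fan-in: merge updates in sorted branch order so the result does not
    # depend on which branch happened to finish first.
    merged = list(state["aggregate"])
    for name in sorted(updates):
        merged = operator.add(merged, updates[name]["aggregate"])
    return {"aggregate": merged}


merged_state = fan_out_fan_in({"aggregate": ["I'm A"]}, {"c": node_c, "b": node_b})
print(merged_state)
```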
LangGraph Studio: Installation and Usage Guide
Summary
This guide introduces LangGraph Studio (also known as LangGraph IDE), a new beta tool from the LangChain team for debugging and visualizing LangGraph applications. It offers real-time node execution monitoring, state inspection, and supports rapid development iterations.
Key Features
Real-time node execution monitoring
State inspection before and after node execution
Breakpoint setting
Live updates reflecting code changes
Prerequisites
Mac computer with Apple Silicon (currently)
LangSmith account (free tier available)
Docker installed and running
Installation Steps
Download the DMG file from the provided repository
Drag the application to the Applications folder
Run the application and log in with LangSmith credentials
Configuration
1. Create a `langgraph.json` file with the following structure:
```json
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./graph/graph.py:app"
  },
  "env": ".env"
}
```
2. Update `pyproject.toml` to include the graph package
Starting the Application
1. Open the project in LangGraph Studio
2. Wait for the Docker containers to load (including LangServe debugger and Postgres)
Interface Overview
- Left side: Visualization of the graph
- Top left: Display name of the graph (e.g., "agent")
- Input box: For entering the initial state (e.g., question)
Running and Debugging
1. Enter a question in the input field (e.g., "What is agent memory?")
2. Submit to run the graph
3. Observe real-time node execution
4. Inspect state at each node
5. Use the "fork" feature to modify execution (e.g., skipping web search)
How It Works
- Uses Docker containers for the debugger, Postgres database, and the LangGraph application
- Persists state after each node execution in the Postgres database
Benefits
- Shortens development lifecycle
- Enables quick iterations in LangGraph agent development
- Provides better visibility into LangGraph logic and application flow
Limitations
- Currently in beta
- Only supports Mac computers with Apple Silicon (as of the recording)
Future Prospects
- Expected support for other operating systems
- Integration with LangGraph Cloud for similar functionality in the cloud
LangGraph Local Setup and Deployment Guide
## Summary
This guide walks through the process of setting up and running a LangGraph application locally using the LangGraph CLI. It covers the necessary steps from configuring the environment to running the application in a Docker container.
## Outline
1. Accessing LangGraph Cloud Console
- Navigate through LangSmith
- Click on the rocket icon to access LangGraph cloud console
2. LangGraph JSON file
- Contains environment variables
- Specifies graph path
- Lists dependencies
3. Local Setup Process
- Install LangGraph CLI
- Run the `langgraph up` command
4. LangGraph CLI Commands
- `langgraph help`: View available commands
- `langgraph dockerfile`: Generate Dockerfile for LangGraph API server
- `langgraph build`: Build Docker image
- `langgraph up`: Create and run Docker container
5. Dockerfile Generation
- Base image: Pre-built LangGraph API image
- Includes necessary environment variables
6. Running the Application
- Execute `langgraph up`
- API accessible at localhost:8123
- Documentation available at localhost:8123/docs
7. API Documentation
- Automatically generated by LangChain
- Accessible through provided URL
## Key Points
- LangGraph simplifies the process of setting up and running LLM-powered applications locally
- The LangGraph CLI provides easy-to-use commands for generating Dockerfiles, building images, and running containers
- The local setup includes both an API server and a Postgres database for state management
- Documentation is automatically generated, making it easier to understand and interact with the API
## Next Steps
- Explore the automatically generated API documentation
- Test the locally running LangGraph application
LangGraph Cloud API Video Summary and Outline
Summary
This video discusses the LangGraph Cloud API, created by LangChain to simplify the process of building and deploying LLM-powered applications.
The API provides endpoints for managing assistants, threads, runs, and cron jobs, automatically generated from a compiled graph.
The video explains the key components of the API, demonstrates how to use various endpoints, and highlights the benefits of using this system for developing and deploying AI applications.
Introduction to LangGraph Cloud API
Created by LangChain and not open sourced
Built with OpenAPI specification
Automatically generated from compiled graph
Key Components of the API
Assistants
Threads
Runs
Cron jobs
Assistants
Definition: Abstraction of compiled graph instance
Creating an assistant
Required parameters: assistant ID, graph ID
Optional: configuration, metadata
Retrieving assistant information
Threads
Definition: Container for accumulated state of multiple invocations
Sharing threads across assistants
Runs
Definition: Invocation of a graph with provided input
Creating a run
Required parameters: assistant ID, thread ID
Optional: checkpoint ID, configuration, metadata
Monitoring run status
Data Storage and Management
Local storage: PostgreSQL container
Production environment: LangGraph cloud offering
API Usage Demonstration
Creating an assistant
Retrieving assistant information
Creating a thread
Creating and monitoring a run
Retrieving thread information and results
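For orientation, here is a hedged sketch of how a client might prepare a request that creates a run. The endpoint path `/threads/{thread_id}/runs` and the payload keys are assumptions based on the required parameters listed above (assistant ID, thread ID); check the generated documentation at localhost:8123/docs for the actual schema. The request is only constructed, not sent, and the thread ID shown is a made-up placeholder.

```python
import json
from typing import Any, Dict
from urllib.request import Request

BASE_URL = "http://localhost:8123"  # local API address mentioned in the setup guide


def build_create_run_request(
    thread_id: str, assistant_id: str, graph_input: Dict[str, Any]
) -> Request:
    """Build (but do not send) a POST request that would start a run.

    The path and body layout are assumptions for illustration.
    """
    body = json.dumps({"assistant_id": assistant_id, "input": graph_input}).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/threads/{thread_id}/runs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_run_request(
    thread_id="example-thread-id",  # hypothetical placeholder
    assistant_id="agent",
    graph_input={"question": "What is agent memory?"},
)
print(request.get_method(), request.full_url)
```

Sending the request would then be a matter of passing it to `urllib.request.urlopen` while the local server is running.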
Benefits of LangGraph Cloud API
Simplifies backend-frontend integration
Handles user management
Provides useful endpoints for LLM applications
Manages data storage and scalability
Advanced Features
Persistence and checkpoints
State management across executions
Filtering and tagging
Deployment Options
Local development setup
Cloud deployment (mentioned for next video)
CopilotKit is an open-source platform for integrating AI copilots (assistants) into applications quickly and easily. In the context of generative UI, CopilotKit offers the following features:
Generative UI Components: CopilotKit allows developers to create dynamic user interfaces with AI-generated components within a chat UI. This is particularly useful when working with AI agents.
Real-time UI Updates: The platform enables real-time updates to the UI based on the AI agent's state and actions.
Custom Rendering: Developers can define custom rendering functions to display agent states or responses in the UI.
Relationship with LangGraph:
Integration Bridge: CopilotKit, specifically its CoAgents feature, acts as a bridge between LangGraph agents and user interfaces. LangGraph is a framework for building stateful, multi-step AI agents.
State Streaming: CopilotKit provides tools to stream the state of LangGraph agents to the frontend in real-time. This allows the UI to reflect the current state and progress of the AI agent.
Dynamic Interaction: It enables dynamic interactions between LangGraph agents and the user interface, allowing for a more interactive and collaborative AI experience.
Custom UI for Agent States: Developers can use CopilotKit to create custom UI components that represent different states or outputs of a LangGraph agent.
Simplified Integration: CopilotKit simplifies the process of connecting LangGraph-based AI agents to web applications, making it easier to create interactive AI experiences.
Welcome to the first LangGraph Udemy course - Unleashing the Power of LLM Agents!
This comprehensive course is designed to teach you how to QUICKLY harness the power of the LangGraph library for LLM agentic applications.
This course will equip you with the skills and knowledge necessary to develop cutting-edge LLM agent solutions for a diverse range of topics.
Please note that this is not a course for beginners. This course assumes that you have a background in software engineering and are proficient in Python & LangChain. I will be using the PyCharm IDE, but you can use any editor you'd like, since we only use basic features of the IDE like debugging and running scripts.
The topics covered in this course include:
LangChain
LCEL
LangGraph
Agents
Multi Agents
Reflection Agents
Reflexion Agents
LangSmith
CrewAI VS LangGraph
Advanced RAG
Corrective RAG
Self RAG
Adaptive RAG
GPT Researcher
LangGraph Ecosystem:
LangGraph Studio / LangGraph IDE
LangGraph Cloud API
LangGraph Cloud Managed Service
Throughout the course, you will work on hands-on exercises and real-world projects to reinforce your understanding of the concepts and techniques covered. By the end of the course, you will be proficient in using LangGraph to create powerful, efficient, and versatile LLM applications for a wide array of use cases.
This is not just a course, it's also a community. Along with lifetime access to the course, you'll get:
Dedicated troubleshooting support with me
Github links with additional AI resources, FAQ, troubleshooting guides
No extra cost for continuous updates and improvements to the course
DISCLAIMERS
Please note that this is not a course for beginners. This course assumes that you have a background in software engineering and are proficient in Python.
I will be using the PyCharm IDE, but you can use any editor you'd like, since we only use basic features of the IDE like debugging and running scripts.