Agentic Frameworks

LangChain & LangGraph

LangChain is the most widely used framework for building with large language models — over 1,000 integrations, millions of developers, and the largest open-source ecosystem in the space. LangGraph is LangChain's own recommended approach for building agents: a graph-based state machine that makes complex, multi-step AI workflows reliable, inspectable, and controllable.

Agentic framework · Free · Open source (MIT) · Python + JS/TS

LangChain and LangGraph — what is the difference?

They are related but distinct. LangChain is a toolkit — a large collection of pre-built components for working with AI models: connectors for different LLMs, tools for searching the web, reading documents, querying databases, and chaining operations together. Think of it as a parts catalogue for building AI applications.

LangGraph is a framework built on top of LangChain specifically for agents. Where LangChain gives you the parts, LangGraph gives you the architecture. It defines how those parts are assembled into a reliable, stateful workflow that can loop, branch, pause for human input, and recover from errors.

LangChain itself recommends using LangGraph for any agent system, rather than LangChain's older agent abstractions. The two work together: LangGraph provides the flow control, LangChain provides the tools and model connections.

Why it became the most widely used option

LangChain launched in October 2022 — just weeks before ChatGPT — and captured early adopters building the first wave of LLM applications. By the time others emerged, LangChain had the biggest ecosystem, the most documentation, and the most Stack Overflow answers. Being first mattered enormously in a new field.

The practical effect: if you search for how to do something with AI agents, there is a strong chance the answer involves LangChain. More integrations, more examples, more developers who know it.

What LangGraph adds

LangGraph represents agent workflows as a graph — a set of nodes (each node is a step, a model call, or a tool) connected by edges (transitions between steps). This is more powerful than a simple linear chain for three reasons:

  • Cycles are possible — the agent can loop back to an earlier step based on what it observes. Essential for self-correction and retry logic.
  • State is explicit — the graph carries a state object through every node. Every step can read and update this state. Nothing is lost between steps.
  • Human-in-the-loop is built in — the graph can be paused at any node, a human can review and modify the state, and execution can resume from the same point.

Who should use LangChain/LangGraph: Developers building production AI applications who want the largest ecosystem, the most integrations, and the ability to find help when things break. It is not the simplest framework to learn, but it is the most capable and the most supported.

What it costs

LangChain and LangGraph are both free and open source under the MIT licence; there is no cost to download or use them. LangSmith — the observability and debugging platform built by the same team — has a free tier (up to 5,000 traces per month) and paid plans starting at $39 per user per month for teams. LangSmith is optional but highly recommended for production use.

What LangChain provides

LangChain's core value is breadth. The framework provides standardised interfaces for the components every AI application needs:

  • Model connectors — unified API for OpenAI, Anthropic, Google, Mistral, Cohere, Ollama (local), and 50+ other LLM providers. Switch models by changing one line.
  • Document loaders — load content from PDFs, Word documents, web pages, Notion, Google Drive, databases, APIs, and dozens of other sources into a consistent format.
  • Text splitters — divide long documents into chunks suitable for embedding and retrieval, with control over chunk size and overlap.
  • Vector store connectors — connect to Pinecone, Chroma, Weaviate, Qdrant, FAISS, and other vector databases for semantic search and RAG.
  • Tool integrations — pre-built tools for web search (Tavily, SerpAPI), code execution, calculator, weather, and hundreds of API integrations.
  • Memory — conversation history, summary memory, entity tracking, and vector store-backed long-term memory.

What LangGraph provides

LangGraph adds workflow architecture on top. A LangGraph application is defined as:

  • A state schema — a typed data structure that holds everything the agent needs to know at any point in its workflow
  • Nodes — Python functions that receive the current state and return an updated state. Each node is one step: a model call, a tool call, a human review point.
  • Edges — connections between nodes. Edges can be unconditional (always go to node B after node A) or conditional (go to node B or node C depending on what node A returned).
  • A compiled graph — the assembled workflow, which LangGraph validates before running to catch structural errors early.

Prompts for building and working with LangChain/LangGraph

Use these with Claude, ChatGPT, or any capable model when building with LangChain and LangGraph.

Getting started and understanding

Explain LangGraph to a developer
I understand Python and I have used LLM APIs directly. Explain what LangGraph adds. What problem does the graph structure solve that a simple loop doesn't? Give me a concrete example of a workflow that is hard to build without it.
LangChain vs LangGraph — which should I use?
I want to build an agent that can search the web, read documents, and produce a structured report. Should I use LangChain's legacy agent abstractions, LangGraph, or something else entirely? Explain the tradeoffs and give me a recommendation with reasoning.
Understand a LangGraph workflow
Here is a LangGraph workflow: [paste code]. Walk me through what it does step by step. What state does it carry? What does each node do? What are the conditional edges checking? What could go wrong and where?

Building agents

Build a research agent with LangGraph
Write a LangGraph agent in Python that: (1) takes a research question as input, (2) searches the web using Tavily, (3) reads the top 3 results in full, (4) synthesises a structured summary with sources cited. Use a typed state object. Include error handling if a URL can't be fetched. Use Claude (claude-sonnet-4-20250514) as the model.
Add a human review checkpoint
I have a LangGraph workflow that produces a draft document. Before it proceeds to the publishing step, I want a human to be able to review and optionally edit the draft. Show me how to add an interrupt point in LangGraph that pauses execution, presents the current state, and allows the state to be updated before continuing.
Build a multi-agent system with LangGraph
Build a LangGraph multi-agent system with three agents: a researcher, a writer, and a fact-checker. The researcher gathers information using web search. The writer produces a structured article from the research. The fact-checker identifies any claims that seem unsupported and flags them. The orchestrator decides whether the output is acceptable or should be revised. Show the full graph definition.
Add retry logic to a failing node
One of my LangGraph nodes calls an external API that occasionally fails or times out. Show me how to add retry logic — maximum 3 attempts with exponential backoff — that retries the API call before routing to an error handler node. The rest of the workflow should continue normally if the retry succeeds.

RAG and document workflows

Build a RAG pipeline with LangChain
I have a folder of PDF documents. Using LangChain, write Python code to: (1) load all PDFs, (2) split them into chunks of 1000 characters with 200 character overlap, (3) embed them using OpenAI's text-embedding-3-small model, (4) store them in a Chroma vector database, (5) create a retriever that finds the 5 most relevant chunks for any query. Include the query step.
Combine RAG with a LangGraph agent
I have a Chroma vector store built with LangChain. Show me how to make this retriever available as a tool inside a LangGraph agent, so the agent can decide when to query the knowledge base versus when to search the web, based on the nature of the question.

Debugging and observability

Debug a LangGraph workflow that's looping
My LangGraph agent is looping — it keeps going back to the same node without making progress. Here is the graph definition: [paste code]. Help me identify why the loop condition is not being satisfied and what I need to change to ensure the agent eventually reaches a terminal state.
Add LangSmith tracing
Show me how to add LangSmith tracing to my existing LangGraph application. I want every node execution, tool call, and model call to be logged with timing data. Show the environment variables needed and any code changes required. I'm using Python.
Write evaluation tests for a LangGraph agent
I have a LangGraph research agent. I want to evaluate its performance across 20 test questions. For each question, I want to measure: whether the final answer addresses the question, whether sources are cited, and whether the answer contains any factual claims that contradict the source documents. Show me how to set up this evaluation using LangSmith's evaluation framework.

LangGraph architecture — the state machine model

LangGraph implements a stateful, cyclical computation graph where each node is a function and edges define control flow. This is architecturally similar to finite state machines but with typed state objects rather than discrete states, and conditional edges that compute the next state transition at runtime.

The central abstraction is the StateGraph — a graph that carries a typed state object (defined as a TypedDict or Pydantic model) through its nodes. Each node is a Python function with signature def node_fn(state: StateType) -> dict: it receives the current state and returns a partial update. LangGraph merges that update into the full state using a configurable per-key reducer (defaulting to last-write-wins).

Edges are added with add_edge(source, target) for unconditional transitions and add_conditional_edges(source, condition_fn, mapping) for branches. The condition function receives the state and returns a string key that maps to a target node name. This enables routing logic: "if the agent decided to use a tool, go to the tool execution node; if it produced a final answer, go to the end node."

Persistence and checkpointing

LangGraph's checkpointing system saves the full state at every node execution. Checkpointers are pluggable backends: in-memory (for development), SQLite (for single-process persistence), and PostgreSQL (for production multi-process deployments). The checkpoint system enables:

  • Resumable workflows — a workflow interrupted by an error or timeout can be resumed from the last checkpoint
  • Human-in-the-loop — the graph can be paused (interrupt_before or interrupt_after on any node), the state can be inspected and modified externally, and execution resumes from the modified state
  • Time travel — a workflow can be replayed from any historical checkpoint (and inspected in LangSmith), enabling debugging of the exact failure conditions

Multi-agent patterns in LangGraph

LangGraph supports two primary multi-agent patterns:

Supervisor pattern

A supervisor node (backed by an LLM) receives the overall goal and routes to specialist sub-graphs. Each specialist is itself a compiled LangGraph graph, called as a tool. The supervisor observes the specialist's output, decides whether it is satisfactory, and either routes to the next specialist or asks the same specialist to revise. This is the most controllable multi-agent pattern.

Hierarchical graph pattern

LangGraph supports subgraphs — compiled graphs used as nodes within a parent graph. The parent graph manages the overall workflow; each subgraph manages a self-contained sub-workflow. State can be passed between parent and subgraph via an input/output mapping. This allows genuinely modular agent systems where each sub-workflow is independently testable.

Streaming

LangGraph supports streaming via graph.stream() and its async counterpart graph.astream(): node-level state updates by default, and token-level streaming from model calls with stream_mode="messages". Both node outputs and intermediate model tokens can be streamed to the client, enabling responsive UIs that show progress as the agent works. This matters for production applications where multi-step agent workflows may run for 30–120 seconds.

MCP integration

LangGraph supports MCP (Model Context Protocol) tool servers via the langchain-mcp-adapters package. Any MCP-compliant tool server can be loaded as a LangChain tool and used inside a LangGraph node. This means the full ecosystem of MCP tool providers (Claude Desktop integrations, third-party MCP servers) is available to LangGraph agents. Official documentation at python.langchain.com.

LangSmith — observability in production

LangSmith provides distributed tracing for LangChain and LangGraph applications. Each run generates a trace tree showing every node execution, model call, token count, latency, and cost. Traces are queryable and filterable. The evaluation framework allows automated testing against datasets with custom evaluators. LangSmith is the primary tool for identifying failure modes, measuring quality, and tracking costs in production LangGraph deployments. Pricing: free tier (5,000 traces/month), Developer ($39/user/month), Plus ($299/user/month). Official documentation at docs.smith.langchain.com.
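Enabling tracing is configuration only; a hedged sketch, where the API key and project name are placeholders (the LANGCHAIN_* variable names are the long-standing ones documented by LangSmith; newer LANGSMITH_TRACING / LANGSMITH_API_KEY aliases also exist, so check the current docs):

```python
import os

# enable LangSmith tracing for every LangChain/LangGraph call in this process
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "ls-placeholder-key"  # your LangSmith API key
os.environ["LANGCHAIN_PROJECT"] = "my-agent"            # traces grouped by project
```

No code changes to the graph itself are required: once these variables are set, runs appear in LangSmith automatically.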

Official documentation

Source note: All technical specifications in this guide are drawn from official LangChain and LangGraph documentation, LangSmith documentation, and the LangChain GitHub repository. Pricing figures are from the official LangSmith pricing page, verified April 2026.