State in LangGraph

This post explores State in LangGraph: how it works as shared memory between nodes, how data flows through graphs, and how state is updated, merged, and validated. We cover defining state with TypedDict and Pydantic; advanced concepts like conditional routing, loops, subgraphs, and multi-agent systems; and performance optimization, common patterns, mistakes to avoid, and best practices for scalable LangGraph applications.

What Is State in LangGraph?

State is the central nervous system of any LangGraph application. It is a shared data structure that holds all the information needed during the execution of your graph. Every node in the graph receives the current state, performs some work, and returns updates that get merged back into the state. This makes LangGraph stateful: it can remember information across multiple steps, loops, and even separate runs (when using a checkpointer). Think of State as a shared whiteboard that all nodes can read from and write to.

How State Works in LangGraph

State works through a define → update → merge cycle:
  1. You define a state schema (what data the graph should track).
  2. Each node receives a copy of the current state.
  3. The node returns a partial update (only the fields it wants to change).
  4. LangGraph merges the updates using reducers.
  5. The updated state is passed to the next node.
Basic Example
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

# 1. Define State Schema
class AgentState(TypedDict):
    messages: Annotated[list, add_messages]   # Conversation history
    documents: list[dict]                      # Retrieved context
    iterations: int                            # Loop counter

# 2. Node that reads and updates state
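def agent_node(state: AgentState):
    # Read from state
    history = state["messages"]
    docs = state.get("documents", [])   # Available as extra context for the prompt
    
    # Call the LLM with the conversation history
    # (llm is assumed to be a chat model created elsewhere)
    response = llm.invoke(history)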
    
    # Return partial update
    return {
        "messages": [response],           # add_messages reducer will append
        "iterations": state.get("iterations", 0) + 1
    }

State as Shared Memory

State acts as shared memory across the entire graph execution:
def retriever_node(state: AgentState):
    query = state["messages"][-1].content
    docs = vector_store.similarity_search(query)
    return {"documents": docs}                    # Shared with all future nodes

def agent_node(state: AgentState):
    # Can access documents retrieved earlier
    context = state.get("documents", [])
    # ... use context for better response
Even in loops, the accumulated state is passed from node to node. This is what gives agents working memory within a run; memory across separate runs requires a checkpointer.

State Flow Between Nodes

State flows in this pattern:
Node A → returns update → Reducer merges → Updated State → Node B
Important Rules:
  • Nodes never modify the state directly.
  • They return a dictionary of updates.
  • LangGraph handles the merging safely using reducers.
Example of State Flow:
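from langchain_core.messages import AIMessage, ToolMessage

def agent_node(state):
    return {"messages": [AIMessage(content="...")]}

def tools_node(state):
    # ToolMessage requires the ID of the tool call it answers ("call_1" is a placeholder)
    return {"messages": [ToolMessage(content="...", tool_call_id="call_1")]}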

# State flows like this:
# Start: {"messages": [HumanMessage]}
# After agent: {"messages": [HumanMessage, AIMessage]}
# After tools: {"messages": [HumanMessage, AIMessage, ToolMessage]}

Reading and Updating State

Reading State

Nodes receive the full current state as the first argument:
from langchain_core.runnables import RunnableConfig

def my_node(state: AgentState, config: RunnableConfig):
    # Reading examples
    last_message = state["messages"][-1]
    all_docs = state.get("documents", [])
    current_iteration = state.get("iterations", 0)
    
    # You can also access config
    thread_id = config["configurable"].get("thread_id")

Updating State

Always return a partial dictionary:
def my_node(state: AgentState):
    # Correct way - partial update
    return {
        "messages": [new_message],
        "documents": new_documents,
        "iterations": state.get("iterations", 0) + 1
    }

Never do this:

# Wrong - modifying state directly
state["messages"].append(new_message)   # Don't do this!
return state
State is not just data; it is the single source of truth for your entire graph execution.
Mastering state design is one of the most important skills when building reliable LangGraph applications.

Immutable vs Mutable Thinking

One of the most important mindset shifts when working with LangGraph is moving from mutable thinking to immutable thinking.

Mutable Thinking (Traditional Python):

You modify objects in place.
state["messages"].append(new_message)   # Wrong in LangGraph!
state["iterations"] += 1

Immutable Thinking (LangGraph Style):

You never modify the state directly. You return a new update that LangGraph merges for you.
# Correct way
return {
    "messages": [new_message],           # Partial update
    "iterations": state.get("iterations", 0) + 1
}

Why LangGraph relies on this:

  • Safe concurrent execution
  • Predictable reducer behavior
  • Better debugging and traceability
  • Works seamlessly with checkpointing and persistence
Rule of Thumb:
Treat state as read-only inside a node. Always return updates instead of mutating.
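
One concrete way mutation goes wrong shows up with an appending reducer such as operator.add: a node that mutates the list and returns the whole state hands the reducer items it already has. The sketch below models this in plain Python. `merge_field` is a hypothetical stand-in for what a reducer does, and the "log" field is invented for illustration; it is not LangGraph's internal code.

```python
import copy
from operator import add

def merge_field(current: list, update: list) -> list:
    """Stand-in for an appending reducer such as operator.add."""
    return add(current, update)

def bad_node(state: dict) -> dict:
    state["log"].append("step-2")   # in-place mutation
    return state                    # returns the full list, not just the new item

def good_node(state: dict) -> dict:
    return {"log": ["step-2"]}      # partial update: only the new item

stored = {"log": ["step-1"]}

# Each node gets its own copy of the stored state in this toy model.
bad_update = bad_node(copy.deepcopy(stored))
bad_result = merge_field(stored["log"], bad_update["log"])

good_update = good_node(copy.deepcopy(stored))
good_result = merge_field(stored["log"], good_update["log"])

print(bad_result)   # ['step-1', 'step-1', 'step-2']
print(good_result)  # ['step-1', 'step-2']
```

The duplicated "step-1" in the first result is the kind of bug that the read-only rule avoids.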

State Lifecycle in a Graph

The state follows a clear lifecycle during graph execution:
  1. Initialization
    Graph starts with the initial state you provide to .invoke() or .stream().
  2. Node Receives State
    Every node gets a copy of the current state.
  3. Node Processing
    Node reads the state and performs logic.
  4. Partial Update Returned
    Node returns only the fields it wants to change.
  5. Reducer Merging
    LangGraph merges the update using the defined reducers (e.g. add_messages).
  6. Updated State Passed Forward
    The new merged state is sent to the next node.
  7. Repeat (in cycles) or End at END.
Visual Lifecycle:
Initial State → Node A → Update → Reducer Merge → Updated State → Node B → ...
This cycle continues until the graph reaches END or an interrupt.
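
The lifecycle above can be sketched as a small loop in plain Python. This is a simplified model under stated assumptions, not how LangGraph is implemented: `merge`, `run`, `route`, and the `REDUCERS` table are hypothetical names, and `END` is a local stand-in for `langgraph.graph.END`.

```python
from operator import add
from typing import Callable

END = "__end__"  # stand-in for langgraph.graph.END

# Per-field reducers; fields without one are overwritten by the update.
REDUCERS: dict[str, Callable] = {"messages": add}

def merge(state: dict, update: dict) -> dict:
    """Return a new state with the partial update merged in (step 5)."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        merged[key] = reducer(state.get(key, []), value) if reducer else value
    return merged

def agent(state: dict) -> dict:
    n = state.get("iterations", 0) + 1
    return {"messages": [f"agent turn {n}"], "iterations": n}

def route(state: dict) -> str:
    return END if state["iterations"] >= 3 else "agent"

def run(initial: dict) -> dict:
    nodes = {"agent": agent}
    state, current = initial, "agent"      # step 1: initialization
    while current != END:                  # step 7: repeat until END
        update = nodes[current](state)     # steps 2-4: node reads state, returns a partial update
        state = merge(state, update)       # step 5: reducer merge
        current = route(state)             # step 6: the updated state decides the next node
    return state

final = run({"messages": ["user: hello"], "iterations": 0})
print(final["iterations"])  # 3
print(final["messages"])
```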

StateGraph and Shared State

StateGraph is designed around the concept of shared state: a single source of truth that all nodes can read from and contribute to.
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import MessagesState

class AgentState(MessagesState):
    # MessagesState is a TypedDict, so fields cannot declare defaults;
    # supply initial values in the input to .invoke()
    documents: list[dict]
    iterations: int
    final_answer: str | None

graph = StateGraph(AgentState)

# All nodes share the same state object (with proper merging)
graph.add_node("retriever", retriever_node)   # Writes documents
graph.add_node("agent", agent_node)           # Reads documents + messages
graph.add_node("tools", tool_node)            # Reads messages, writes tool results

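graph.add_edge(START, "retriever")
graph.add_edge("retriever", "agent")               # Retrieved documents flow to the agent
graph.add_conditional_edges("agent", route_tools)  # route_tools (defined elsewhere) picks the next node
graph.add_edge("tools", "agent")                   # Tool results flow back to the agent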
Key Point:
Even though each node receives its own view of the state, the merged result becomes the shared state for the next node. This creates powerful coordination between nodes.

Defining State with TypedDict

TypedDict is the most common and lightweight way to define state in LangGraph.
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    # Conversation history; add_messages appends new messages
    messages: Annotated[list, add_messages]
    
    # Retrieved documents; the custom reducer concatenates lists
    documents: Annotated[list[dict], lambda a, b: a + b]
    iterations: int
    confidence: float
    next: str | None
Advantages:
  • Simple and lightweight
  • Excellent IDE support
  • No extra dependencies
When to use: Most production graphs and tutorials.

Defining State with Dataclass

You can also use Python’s dataclass (less common but valid).
from dataclasses import dataclass, field
from typing import Annotated, Any
from langgraph.graph.message import add_messages

@dataclass
class AgentState:
    messages: Annotated[list, add_messages] = field(default_factory=list)
    documents: list[dict] = field(default_factory=list)
    iterations: int = field(default=0)
    confidence: float = field(default=0.0)
    metadata: dict[str, Any] = field(default_factory=dict)
Advantages:
  • Clean, familiar syntax
  • Good for simple cases
Limitations:
  • Less powerful validation than Pydantic
  • Slightly more verbose with reducers
Note: Most developers now prefer TypedDict (lightweight) or Pydantic (robust) over dataclasses.
Key Recommendation (2025+):
# Good default for most projects
from langgraph.graph.message import MessagesState

class AgentState(MessagesState):      # Inherit from MessagesState for convenience
    # MessagesState is a TypedDict: it supplies the messages field with the
    # add_messages reducer, and fields cannot declare defaults
    documents: list[dict]
    iterations: int

TypedDict vs Pydantic vs Dataclass in LangGraph

| Criteria | TypedDict | Pydantic (BaseModel) | Dataclass |
| --- | --- | --- | --- |
| Recommended For | Most common use cases | Production, complex apps | Simple cases |
| Type Checking | Excellent (static) | Excellent + runtime validation | Good |
| Default Values | Manual | Excellent (Field(default=...)) | Good (field(default=...)) |
| Reducers (Annotated) | Best support | Excellent | Good |
| Validation | None (static only) | Very strong (runtime) | Basic |
| Serialization | Manual | Built-in (model_dump()) | Manual |
| Performance | Fastest | Very fast | Fast |
| Boilerplate | Low | Medium | Medium |
| IDE Support | Excellent | Best | Good |
| LangGraph Compatibility | Excellent | Excellent | Good |

1. TypedDict

Best for: Most LangGraph applications, tutorials, and medium-complexity agents.
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    messages: Annotated[list, add_messages]
    documents: Annotated[list[dict], lambda a, b: a + b]   # Custom reducer
    iterations: int
    confidence: float
    next: str | None
Pros:
  • Lightweight and simple
  • No extra dependencies
  • Great IDE autocomplete
  • Easy to understand
Cons:
  • No runtime validation
  • Default values are not supported; pass initial values in the graph input or read with state.get()

2. Pydantic (BaseModel)

Best for: Production systems, large agents, when you need validation and defaults.
from pydantic import BaseModel, ConfigDict, Field
from typing import Annotated
from langgraph.graph.message import add_messages

class AgentState(BaseModel):
    # Pydantic v2 style; replaces the deprecated inner "class Config"
    model_config = ConfigDict(arbitrary_types_allowed=True)

    messages: Annotated[list, add_messages] = Field(default_factory=list)
    
    documents: list[dict] = Field(default_factory=list)
    iterations: int = Field(default=0)
    confidence: float = Field(default=0.0, ge=0.0, le=1.0)  # Validation
    final_answer: str | None = Field(default=None)
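Pros:
  • Runtime validation
  • Excellent default values
  • Built-in serialization (model_dump())
  • Great error messages
Cons:
  • Slightly more overhead (still negligible)
  • Cannot inherit from MessagesState, which is a TypedDict; declare the messages field with add_messages yourself
  • Nodes receive a model instance, so fields are read as attributes (state.messages) rather than with state["messages"]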

3. Dataclass

Best for: Very simple graphs where you prefer class syntax.
from dataclasses import dataclass, field
from typing import Annotated, Any
from langgraph.graph.message import add_messages

@dataclass
class AgentState:
    messages: Annotated[list, add_messages] = field(default_factory=list)
    documents: list[dict] = field(default_factory=list)
    iterations: int = field(default=0)
    metadata: dict[str, Any] = field(default_factory=dict)
Pros:
  • Clean, familiar Python syntax
  • Good for small teams
Cons:
  • Weaker validation
  • More verbose with reducers
  • Less popular in LangGraph community
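
The "weaker validation" point can be partly offset by hand-written checks in `__post_init__`. The following is a minimal sketch using only the standard library; the specific range checks are illustrative assumptions, and they only run where the dataclass is actually instantiated.

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    documents: list[dict] = field(default_factory=list)
    iterations: int = 0
    confidence: float = 0.0

    def __post_init__(self) -> None:
        # Dataclasses do not validate types or ranges on their own,
        # so any checks have to be written by hand.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence must be in [0, 1], got {self.confidence}")
        if self.iterations < 0:
            raise ValueError("iterations must be non-negative")

ok = AgentState(confidence=0.9)

error = None
try:
    AgentState(confidence=1.7)
except ValueError as exc:
    error = str(exc)

print(ok)
print(error)  # confidence must be in [0, 1], got 1.7
```

Pydantic expresses the same range check declaratively with Field(ge=0.0, le=1.0), which is one reason it is usually preferred when validation matters.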
Modern Recommendation
For most users:
# Good default - inherit from MessagesState
from langgraph.graph.message import MessagesState

class AgentState(MessagesState):      # Recommended starting point
    documents: list[dict]
    iterations: int
    confidence: float
Why?
  • You get all the benefits of MessagesState (proper message handling)
  • MessagesState is a TypedDict, so the state stays lightweight; initial values come from the graph input
  • If you need runtime validation and defaults, define a Pydantic BaseModel with its own messages: Annotated[list, add_messages] field instead
  • Clean and scalable

State-Driven Routing

State-Driven Routing means making routing decisions (which node to go to next) based on the current state of the graph. This is the foundation of intelligent, dynamic workflows in LangGraph. Instead of hard-coded logic, the router inspects the shared state and decides the next step.

Using State in Conditional Edges

This is the most common and powerful use of state-driven routing.
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import MessagesState

class AgentState(MessagesState):
    # MessagesState is a TypedDict, so fields cannot declare defaults;
    # the router below falls back to defaults with state.get()
    documents: list[dict]
    iterations: int
    confidence: float
    task_complete: bool

def route_after_agent(state: AgentState):
    last_message = state["messages"][-1]
    iterations = state.get("iterations", 0)
    
    # Multiple state-based decisions
    if iterations >= 15:
        return END       # The END constant, not the string "END"
    elif last_message.tool_calls:
        return "tools"
    elif state.get("confidence", 0) > 0.85 or state.get("task_complete"):
        return END
    elif len(state.get("documents", [])) == 0:
        return "retriever"
    else:
        return "agent"   # Continue loop


graph = StateGraph(AgentState)
graph.add_node("agent", agent_node)
graph.add_node("tools", tool_node)
graph.add_node("retriever", retriever_node)

graph.add_conditional_edges("agent", route_after_agent)
Key Advantage: The router has full access to all accumulated state (messages, documents, counters, flags, etc.).
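
Because a router is a plain function of state, it can be unit-tested without running a graph or calling a model. The sketch below exercises each branch of the router above with dictionary states. `FakeMessage` is a hypothetical stand-in for a real message object, and `END` is a local stand-in for `langgraph.graph.END`.

```python
from dataclasses import dataclass, field

END = "__end__"  # stand-in for langgraph.graph.END

@dataclass
class FakeMessage:
    """Hypothetical minimal message exposing the attribute the router reads."""
    content: str = ""
    tool_calls: list = field(default_factory=list)

def route_after_agent(state: dict) -> str:
    last_message = state["messages"][-1]
    if state.get("iterations", 0) >= 15:
        return END
    if last_message.tool_calls:
        return "tools"
    if state.get("confidence", 0) > 0.85 or state.get("task_complete"):
        return END
    if not state.get("documents"):
        return "retriever"
    return "agent"

plain = FakeMessage(content="thinking")
with_tool = FakeMessage(tool_calls=[{"name": "search"}])

# Each case pins one branch of the router.
results = {
    "iteration_cap": route_after_agent({"messages": [with_tool], "iterations": 15}),
    "tool_call": route_after_agent({"messages": [with_tool], "iterations": 1}),
    "confident": route_after_agent({"messages": [plain], "confidence": 0.9}),
    "no_documents": route_after_agent({"messages": [plain], "confidence": 0.2}),
    "keep_looping": route_after_agent(
        {"messages": [plain], "confidence": 0.2, "documents": [{"id": 1}]}
    ),
}
print(results)
```

Note that the iteration cap is checked first, so it wins even when the last message still requests a tool call.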
 

State in Multi-Agent Systems

In multi-agent setups, state acts as the shared context between agents.
class TeamState(MessagesState):
    # TypedDict fields cannot declare defaults; pass initial values in the graph input
    documents: list[dict]
    iterations: int
    current_agent: str
    task_status: str

def supervisor_router(state: TeamState):
    last = state["messages"][-1].content.lower()
    
    if "research" in last:
        return "research_agent"
    elif "code" in last:
        return "coder_agent"
    elif state.get("task_status") == "complete":
        return END       # The END constant from langgraph.graph
    else:
        return "critic_agent"


graph = StateGraph(TeamState)
graph.add_node("supervisor", supervisor_node)
graph.add_node("research_agent", research_agent)
graph.add_node("coder_agent", coder_agent)
graph.add_node("critic_agent", critic_agent)

graph.add_conditional_edges("supervisor", supervisor_router)

# All agents return to supervisor (shared state)
graph.add_edge("research_agent", "supervisor")
graph.add_edge("coder_agent", "supervisor")
graph.add_edge("critic_agent", "supervisor")
The shared state allows agents to collaborate effectively by reading and updating common information.

State in Subgraphs

Subgraphs can have their own state schema, but they often share or map to the parent state.
# Subgraph with its own focused state
class ResearchState(TypedDict):
    messages: Annotated[list, add_messages]
    query: str
    research_results: list[dict]

def create_research_subgraph():
    subgraph = StateGraph(ResearchState)
    subgraph.add_node("planner", planner_node)
    subgraph.add_node("researcher", researcher_node)
    subgraph.add_edge(START, "planner")
    subgraph.add_edge("planner", "researcher")
    subgraph.add_edge("researcher", END)
    return subgraph.compile()

# Main graph
class MainState(MessagesState):
    # TypedDict fields cannot declare defaults; pass initial values in the graph input
    documents: list[dict]
    research_complete: bool

main_graph = StateGraph(MainState)
main_graph.add_node("research_team", create_research_subgraph())

Passing State Between Subgraphs

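There are two main ways to pass state:

1. Shared State Keys

When the parent graph and the subgraph declare the same key (here, messages), the compiled subgraph can be added directly as a node. LangGraph passes the shared keys in and merges the subgraph's updates to those keys back into the parent state; keys that exist only in the subgraph (query, research_results) stay private to it.
from langgraph.graph import StateGraph

class MainState(MessagesState):
    documents: list[dict]
    research_summary: str | None

# ResearchState and MainState both declare "messages", so that key is shared
research_subgraph = create_research_subgraph()

main_graph = StateGraph(MainState)
main_graph.add_node("research_team", research_subgraph)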

2. Manual State Transformation

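research_subgraph = create_research_subgraph()

def research_team_node(state: MainState) -> dict:
    # Entry: map parent state to the subgraph's input schema
    result = research_subgraph.invoke({
        "messages": state["messages"],
        "query": state["messages"][-1].content,
    })
    # Exit: map the subgraph's output back to parent state keys
    return {
        "documents": result["research_results"],
        "research_summary": "Research completed",
    }

# The wrapper function, not the compiled subgraph, is registered as the node
main_graph.add_node("research_team", research_team_node)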

Performance Considerations

Large or frequently updated state can impact performance:
  • State is read, merged, and (with a checkpointer) serialized at every step, so the cost of each step grows with state size
  • Large message histories increase token usage and latency
  • Checkpointing large states increases storage cost
Optimization Tips:
  • Keep state minimal
  • Summarize old messages regularly
  • Use trim_messages or summarization nodes in loops

Large State Management

When your state becomes large:
class OptimizedState(MessagesState):
    # TypedDict fields cannot declare defaults; pass initial values in the graph input
    messages: Annotated[list, add_messages]
    documents: list[dict]
    # Instead of storing full results, store references
    document_ids: list[str]
    summary: str | None                 # Keep summarized version
 
Strategies:
  • Store references (IDs) instead of full objects
  • Use vector store + retriever instead of dumping everything into state
  • Periodically summarize conversation history

State Optimization Strategies

Best Practices:
  1. Use MessagesState as base when possible
  2. Separate concerns (conversation vs knowledge vs control)
  3. Summarize aggressively in long-running graphs
  4. Use Pydantic for better defaults and validation
  5. Avoid storing large binary data in state
Example Optimized State:
class AgentState(MessagesState):
    messages: Annotated[list, add_messages]
    
    # Knowledge
    document_ids: list[str]
    current_context_summary: str | None
    
    # Control (TypedDict fields cannot declare defaults; pass
    # iterations=0 and max_iterations=15 in the graph input)
    iterations: int
    max_iterations: int
    
    # Output
    final_answer: str | None
Well-designed state = predictable, efficient, and maintainable agents.
State is not just data storage; it is the architecture of your LangGraph application.

Common State Patterns

Here are the most effective and widely used patterns for designing state in LangGraph:

1. Metadata Pattern

Used to store control information, flags, and auxiliary data separate from core conversation.
from typing import Literal

class AgentState(MessagesState):
    messages: Annotated[list, add_messages]
    
    # Metadata fields (TypedDict fields cannot declare defaults)
    metadata: dict
    iterations: int
    confidence: float
    task_status: Literal["running", "complete", "failed"]
    last_tool_used: str | None
Usage:
def agent_node(state: AgentState):
    return {
        "messages": [response],
        "metadata": {**state.get("metadata", {}), "last_updated": "now"},
        "iterations": state.get("iterations", 0) + 1
    }

2. Scratchpad Pattern

A temporary workspace for intermediate thoughts, plans, or calculations.
class AgentState(MessagesState):
    # TypedDict fields cannot declare defaults; pass initial values in the graph input
    messages: Annotated[list, add_messages]
    scratchpad: str                     # Agent's private notes
    plan: list[str]                     # Step-by-step plan
    working_memory: dict
Example Usage:
def planner_node(state: AgentState):
    plan = llm.invoke(f"Create a plan for: {state['messages'][-1].content}")
    return {
        "scratchpad": plan.content,
        "plan": plan.content.split("\n")
    }
This pattern helps agents maintain internal reasoning separate from final output.

3. Tool Result Pattern

Structured way to store and manage tool execution results.
from operator import add
from pydantic import BaseModel

class ToolResult(BaseModel):
    tool_name: str
    input: dict
    output: str
    success: bool
    timestamp: str

class AgentState(MessagesState):
    messages: Annotated[list, add_messages]
    tool_results: list[ToolResult]                   # No reducer: overwritten with the latest batch
    tool_history: Annotated[list[ToolResult], add]   # operator.add appends on every update
Usage:
def tools_node(state: AgentState):
    results = execute_tools(...)
    tool_result = ToolResult(...)
    
    return {
        "tool_results": [tool_result],
        "tool_history": [tool_result]
    }

4. Memory-Augmented State

Combining short-term (messages) and long-term memory.
class AgentState(MessagesState):
    # TypedDict fields cannot declare defaults; pass initial values in the graph input
    messages: Annotated[list, add_messages]
    
    # Long-term memory
    long_term_memory: list[dict]
    user_profile: dict
    
    # Summarized history
    conversation_summary: str | None
This pattern is essential for long-running agents.

Common Mistakes with State

1. Overloading State

Putting too many unrelated things into state.
Bad Example:
class BadState(TypedDict):
    data: dict                    # Everything dumped here
    temp: Any
    cache: dict
    everything: list
Problem: Hard to understand, debug, and maintain.

2. Poor State Naming

Using vague names like data, result, info, stuff.
Bad:
result: list
info: dict
temp_data: Any
Good:
retrieved_documents: list[dict]
current_plan: list[str]
user_preferences: dict

3. Mutable State Confusion

Trying to mutate state directly inside nodes.
# Wrong
def bad_node(state):
    state["messages"].append(new_msg)   # Don't do this!
    return state
Correct:
def good_node(state):
    return {"messages": [new_msg]}   # Let reducer handle it

4. Storing Too Much Data

Storing large objects, full documents, or binary data directly in state.
Bad:
full_documents: list[Document]   # Very large
raw_api_responses: list[dict]

Better: Store references or summaries instead.

5. Unstructured State Design

No clear organization or separation of concerns.
Bad:
class MessyState(TypedDict):
    x: Any
    y: Any
    z: Any
Good:
class CleanState(MessagesState):
    messages: Annotated[list, add_messages]
    knowledge: list[dict]
    control: dict
    output: dict

Best Practices for State Design

1. Keep State Minimal

Only include fields that are actively used during graph execution.
# Good
class AgentState(MessagesState):
    messages: Annotated[list, add_messages]
    documents: list[dict]
    iterations: int
Avoid storing full database results or large raw data.

2. Use Typed State

Always define your state with types.
Preferred (Pydantic):
class AgentState(BaseModel):
    messages: Annotated[list, add_messages] = Field(default_factory=list)
    documents: list[dict] = Field(default_factory=list)
    iterations: int = Field(default=0)

3. Separate Messages from Metadata

class AgentState(MessagesState):
    # Conversation
    messages: Annotated[list, add_messages]
    
    # Metadata / Control
    metadata: dict
    control: dict
    
    # Knowledge
    documents: list[dict]
This separation improves clarity and maintainability.

4. Design Clear State Flows

Document which nodes read/write which fields.
class AgentState(MessagesState):
    messages: Annotated[list, add_messages]   # Written by: agent, tools
    documents: list[dict]                     # Written by: retriever
    iterations: int                           # Written by: agent
    final_answer: str | None                  # Written by: final node

5. Use Reducers Carefully

Only use custom reducers when necessary.
# Good - Simple and clear
messages: Annotated[list, add_messages]
documents: Annotated[list[dict], lambda a, b: a + b]

# Avoid over-engineering
# custom_reducer_that_does_too_much
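
When a custom reducer is warranted, a small named function is easier to test and document than a lambda. The sketch below shows one possible reducer that appends documents while skipping duplicates; `add_unique_documents` is a hypothetical name, and the assumption that every document dict carries an "id" key is made for this example only.

```python
from typing import Annotated, TypedDict

def add_unique_documents(current: list[dict], new: list[dict]) -> list[dict]:
    """Reducer: append new documents, skipping IDs that are already present."""
    seen = {doc["id"] for doc in current}
    merged = list(current)          # never mutate the existing list
    for doc in new:
        if doc["id"] not in seen:
            merged.append(doc)
            seen.add(doc["id"])
    return merged

class AgentState(TypedDict):
    documents: Annotated[list[dict], add_unique_documents]

existing = [{"id": "a", "text": "first"}]
incoming = [
    {"id": "a", "text": "first (duplicate)"},
    {"id": "b", "text": "second"},
    {"id": "b", "text": "second (duplicate)"},
]

result = add_unique_documents(existing, incoming)
print([doc["id"] for doc in result])  # ['a', 'b']
```

The reducer does one job (deduplicated append) and leaves its inputs untouched, which keeps its behavior predictable when several nodes write to the same field.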
Final Recommendation:
State should be like a well-organized desk: everything has its place, and only necessary items are kept on it.
A clean, well-designed state makes your entire LangGraph application more reliable, debuggable, and scalable.
