AI Agents LangGraph

State in LangGraph

Intermediate

This post explores everything about State in LangGraph, how it works as shared memory between nodes, how data flows through graphs, and how state is updated, merged, and validated. We cover defining state with TypedDict and Pydantic , advanced concepts like conditional routing, loops, subgraphs, and multi-agent systems, plus performance optimization, common patterns, mistakes to avoid, and best practices for scalable LangGraph applications.

What Is State in LangGraph?

State is the central nervous system of any LangGraph application. It is a shared data structure that holds all the information needed during the execution of your graph. Every node in the graph receives the current state, performs some work, and returns updates that get merged back into the state. This makes LangGraph stateful — meaning it can remember information across multiple steps, loops, and even separate runs (when using a checkpointer). Think of State as a shared whiteboard that all nodes can read from and write to.

How State Works in LangGraph

State works through a define → update → merge cycle:

You define a state schema (what data the graph should track).
Each node receives a copy of the current state.
The node returns a partial update (only the fields it wants to change).
LangGraph merges the updates using reducers.
The updated state is passed to the next node.

Basic Example

from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

# 1. Define State Schema
class AgentState(TypedDict):
    messages: Annotated[list, add_messages]   # Conversation history
    documents: list[dict]                      # Retrieved context
    iterations: int                            # Loop counter

# 2. Node that reads and updates state
def agent_node(state: AgentState):
    # Read from state
    history = state["messages"]
    docs = state.get("documents", [])
    
    # Call LLM with full context
    response = llm.invoke(history)
    
    # Return partial update
    return {
        "messages": [response],           # add_messages reducer will append
        "iterations": state.get("iterations", 0) + 1
    }

State as Shared Memory

State acts as shared memory across the entire graph execution:

def retriever_node(state: AgentState):
    query = state["messages"][-1].content
    docs = vector_store.similarity_search(query)
    return {"documents": docs}                    # Shared with all future nodes

def agent_node(state: AgentState):
    # Can access documents retrieved earlier
    context = state.get("documents", [])
    # ... use context for better response

Even in loops, the same state object (with accumulated data) is passed around. This is what allows agents to have long-term memory during execution.

State Flow Between Nodes

State flows in this pattern:

Node A → returns update → Reducer merges → Updated State → Node B

Important Rules:

Nodes never modify the state directly.
They return a dictionary of updates.
LangGraph handles the merging safely using reducers.

Example of State Flow:

def agent_node(state):
    return {"messages": [AIMessage(content="...")]}

def tools_node(state):
    return {"messages": [ToolMessage(content="...")]}

# State flows like this:
# Start: {"messages": [HumanMessage]}
# After agent: {"messages": [HumanMessage, AIMessage]}
# After tools: {"messages": [HumanMessage, AIMessage, ToolMessage]}

Reading and Updating State

Reading State

Nodes receive the full current state as the first argument:

def my_node(state: AgentState, config: RunnableConfig):
    # Reading examples
    last_message = state["messages"][-1]
    all_docs = state.get("documents", [])
    current_iteration = state.get("iterations", 0)
    
    # You can also access config
    thread_id = config["configurable"].get("thread_id")

Updating State

Always return a partial dictionary:

def my_node(state: AgentState):
    # Correct way - partial update
    return {
        "messages": [new_message],
        "documents": new_documents,
        "iterations": state.get("iterations", 0) + 1
    }

Never do this:

# Wrong - modifying state directly
state["messages"].append(new_message)   # Don't do this!
return state

State is not just data, it is the single source of truth for your entire graph execution.

Mastering state design is one of the most important skills when building reliable LangGraph applications.

Immutable vs Mutable Thinking

One of the most important mindset shifts when working with LangGraph is moving from mutable thinking to immutable thinking. Mutable Thinking (Traditional Python): You modify objects in place.

state["messages"].append(new_message)   # Wrong in LangGraph!
state["iterations"] += 1

Immutable Thinking (LangGraph Style):

You never modify the state directly. You return a new update that LangGraph merges for you.

# Correct way
return {
    "messages": [new_message],           # Partial update
    "iterations": state.get("iterations", 0) + 1
}

Why LangGraph enforces this:

Safe concurrent execution
Predictable reducer behavior
Better debugging and traceability
Works seamlessly with checkpointing and persistence

Rule of Thumb:

Treat state as read-only inside a node. Always return updates instead of mutating.

State Lifecycle in a Graph

The state follows a clear lifecycle during graph execution:

Initialization
Graph starts with the initial state you provide to .invoke() or .stream().
Node Receives State
Every node gets a copy of the current state.
Node Processing
Node reads the state and performs logic.
Partial Update Returned
Node returns only the fields it wants to change.
Reducer Merging
LangGraph merges the update using the defined reducers (e.g. add_messages).
Updated State Passed Forward
The new merged state is sent to the next node.
Repeat (in cycles) or End at END.

Visual Lifecycle:

Initial State → Node A → Update → Reducer Merge → Updated State → Node B → ...

This cycle continues until the graph reaches END or an interrupt.

StateGraph and Shared State



          StateGraph

is designed around the concept of shared state — a single source of truth that all nodes can read from and contribute to.

from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import MessagesState

class AgentState(MessagesState):
    documents: list[dict] = []
    iterations: int = 0
    final_answer: str | None = None

graph = StateGraph(AgentState)

# All nodes share the same state object (with proper merging)
graph.add_node("retriever", retriever_node)   # Writes documents
graph.add_node("agent", agent_node)           # Reads documents + messages
graph.add_node("tools", tool_node)            # Reads messages, writes tool results

graph.add_edge(START, "retriever")
graph.add_conditional_edges("agent", route_tools)

Key Point:
Even though each node receives its own view of the state, the merged result becomes the shared state for the next node. This creates powerful coordination between nodes.

Defining State with TypedDict


         TypedDict

is the most common and lightweight way to define state in LangGraph.

from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    # Required fields with reducers
    messages: Annotated[list, add_messages]
    
    # Optional fields
    documents: Annotated[list[dict], lambda a, b: a + b]   # Custom reducer
    iterations: int
    confidence: float
    next: str | None

Advantages:

Simple and lightweight
Excellent IDE support
No extra dependencies

When to use: Most production graphs and tutorials.

Defining State with Dataclass

You can also use Python’s dataclass (less common but valid).

from dataclasses import dataclass, field
from typing import Annotated, Any
from langgraph.graph.message import add_messages

@dataclass
class AgentState:
    messages: Annotated[list, add_messages] = field(default_factory=list)
    documents: list[dict] = field(default_factory=list)
    iterations: int = field(default=0)
    confidence: float = field(default=0.0)
    metadata: dict[str, Any] = field(default_factory=dict)

Advantages:

Clean, familiar syntax
Good for simple cases

Limitations:

Less powerful validation than Pydantic
Slightly more verbose with reducers

Note: Most developers now prefer TypedDict (lightweight) or Pydantic (robust) over dataclasses.

Key Recommendation (2025+):

# Best balance for most projects
from pydantic import BaseModel, Field
from langgraph.graph.message import MessagesState

class AgentState(MessagesState):      # Inherit from MessagesState for convenience
    documents: list[dict] = Field(default_factory=list)
    iterations: int = Field(default=0)

TypedDict vs Pydantic vs Dataclass in LangGraph

Criteria	TypedDict	Pydantic (BaseModel)	Dataclass
Recommended For	Most common use cases	Production, complex apps	Simple cases
Type Checking	Excellent (static)	Excellent + Runtime validation	Good
Default Values	Manual	Excellent ( `Field(default=...)` )	Good ( `field(default=...)` )
Reducers ( `Annotated` )	Best support	Excellent	Good
Validation	None (static only)	Very Strong (runtime)	Basic
Serialization	Manual	Built-in ( `model_dump()` )	Manual
Performance	Fastest	Very Fast	Fast
Boilerplate	Low	Medium	Medium
IDE Support	Excellent	Best	Good
LangGraph Compatibility	Excellent	Excellent	Good

1. TypedDict (Most Popular)

Best for: Most LangGraph applications, tutorials, and medium-complexity agents.

from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    messages: Annotated[list, add_messages]
    documents: Annotated[list[dict], lambda a, b: a + b]   # Custom reducer
    iterations: int
    confidence: float
    next: str | None

Pros:

Lightweight and simple
No extra dependencies
Great IDE autocomplete
Easy to understand

Cons:

No runtime validation
Default values are tricky

2. Pydantic (Recommended for Production)

Best for: Production systems, large agents, when you need validation and defaults.

from pydantic import BaseModel, Field
from typing import Annotated
from langgraph.graph.message import add_messages

class AgentState(BaseModel):
    messages: Annotated[list, add_messages] = Field(default_factory=list)
    
    documents: list[dict] = Field(default_factory=list)
    iterations: int = Field(default=0)
    confidence: float = Field(default=0.0, ge=0.0, le=1.0)  # Validation
    final_answer: str | None = Field(default=None)
    
    class Config:
        arbitrary_types_allowed = True

Pros:

Runtime validation
Excellent default values
Built-in serialization (model_dump())
Great error messages
Can inherit from MessagesState

Cons:

Slightly more overhead (still negligible)

3. Dataclass

Best for : Very simple graphs where you prefer class syntax.

from dataclasses import dataclass, field
from typing import Annotated, Any
from langgraph.graph.message import add_messages

@dataclass
class AgentState:
    messages: Annotated[list, add_messages] = field(default_factory=list)
    documents: list[dict] = field(default_factory=list)
    iterations: int = field(default=0)
    metadata: dict[str, Any] = field(default_factory=dict)

Pros:

Clean, familiar Python syntax
Good for small teams

Cons:

Weaker validation
More verbose with reducers
Less popular in LangGraph community

Modern Recommendation

For most users:

# Best balance - Inherit from MessagesState
from pydantic import BaseModel, Field
from langgraph.graph.message import MessagesState

class AgentState(MessagesState):      # Recommended approach
    documents: list[dict] = Field(default_factory=list)
    iterations: int = Field(default=0)
    confidence: float = Field(default=0.0)

Why?

You get all the benefits of MessagesState (proper message handling)
Plus Pydantic’s validation and defaults
Clean and scalable

State-Driven Routing

State-Driven Routing means making routing decisions (which node to go to next) based on the current state of the graph. This is the foundation of intelligent, dynamic workflows in LangGraph. Instead of hard-coded logic, the router inspects the shared state and decides the next step.

Using State in Conditional Edges

This is the most common and powerful use of state-driven routing.

from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import MessagesState

class AgentState(MessagesState):
    documents: list[dict] = []
    iterations: int = 0
    confidence: float = 0.0
    task_complete: bool = False

def route_after_agent(state: AgentState):
    last_message = state["messages"][-1]
    iterations = state.get("iterations", 0)
    
    # Multiple state-based decisions
    if iterations >= 15:
        return "END"
    elif last_message.tool_calls:
        return "tools"
    elif state.get("confidence", 0) > 0.85 or state.get("task_complete"):
        return "END"
    elif len(state.get("documents", [])) == 0:
        return "retriever"
    else:
        return "agent"   # Continue loop


graph = StateGraph(AgentState)
graph.add_node("agent", agent_node)
graph.add_node("tools", tool_node)
graph.add_node("retriever", retriever_node)

graph.add_conditional_edges("agent", route_after_agent)

Key Advantage: The router has full access to all accumulated state (messages, documents, counters, flags, etc.).

State in Multi-Agent Systems

In multi-agent setups, state acts as the shared context between agents.

class TeamState(MessagesState):
    documents: list[dict] = []
    iterations: int = 0
    current_agent: str = "supervisor"
    task_status: str = "in_progress"

def supervisor_router(state: TeamState):
    last = state["messages"][-1].content.lower()
    
    if "research" in last:
        return "research_agent"
    elif "code" in last:
        return "coder_agent"
    elif state.get("task_status") == "complete":
        return "END"
    else:
        return "critic_agent"


graph = StateGraph(TeamState)
graph.add_node("supervisor", supervisor_node)
graph.add_node("research_agent", research_agent)
graph.add_node("coder_agent", coder_agent)
graph.add_node("critic_agent", critic_agent)

graph.add_conditional_edges("supervisor", supervisor_router)

# All agents return to supervisor (shared state)
graph.add_edge("research_agent", "supervisor")
graph.add_edge("coder_agent", "supervisor")
graph.add_edge("critic_agent", "supervisor")

The shared state allows agents to collaborate effectively by reading and updating common information.

State in Subgraphs

Subgraphs can have their own state schema, but they often share or map to the parent state.

# Subgraph with its own focused state
class ResearchState(TypedDict):
    messages: Annotated[list, add_messages]
    query: str
    research_results: list[dict]

def create_research_subgraph():
    subgraph = StateGraph(ResearchState)
    subgraph.add_node("planner", planner_node)
    subgraph.add_node("researcher", researcher_node)
    subgraph.add_edge(START, "planner")
    subgraph.add_edge("planner", "researcher")
    subgraph.add_edge("researcher", END)
    return subgraph.compile()

# Main graph
class MainState(MessagesState):
    documents: list[dict] = []
    research_complete: bool = False

main_graph = StateGraph(MainState)
main_graph.add_node("research_team", create_research_subgraph())

Passing State Between Subgraphs

There are two main ways to pass state:

1. Automatic State Mapping (Recommended)

from langgraph.graph import StateGraph

class MainState(MessagesState):
    documents: list[dict] = []
    research_summary: str | None = None

# Subgraph input/output mapping
research_subgraph = create_research_subgraph().with_config(
    # Map main state to subgraph state
    input_mapping={"messages": "messages", "documents": "research_results"},
    output_mapping={"research_results": "documents"}
)

main_graph.add_node("research_team", research_subgraph)

2. Manual State Transformation

def research_entry_point(state: MainState):
    return {
        "messages": state["messages"],
        "query": state["messages"][-1].content
    }

def research_exit_point(state: ResearchState) -> MainState:
    return {
        "documents": state["research_results"],
        "research_summary": "Research completed"
    }

Performance Considerations

Large or frequently updated state can impact performance:

Deep copying happens on every node execution
Large message histories increase token usage and latency
Checkpointing large states increases storage cost

Optimization Tips:

Keep state minimal
Summarize old messages regularly
Use trim_messages or summarization nodes in loops

Large State Management

When your state becomes large:

class OptimizedState(MessagesState):
    messages: Annotated[list, add_messages]
    documents: list[dict] = Field(default_factory=list)
    # Instead of storing full results, store references
    document_ids: list[str] = Field(default_factory=list)
    summary: str | None = None          # Keep summarized version

Strategies:

Store references (IDs) instead of full objects
Use vector store + retriever instead of dumping everything into state
Periodically summarize conversation history

State Optimization Strategies

Best Practices:

Use MessagesState as base when possible
Separate concerns (conversation vs knowledge vs control)
Summarize aggressively in long-running graphs
Use Pydantic for better defaults and validation
Avoid storing large binary data in state

Example Optimized State:

class AgentState(MessagesState):
    messages: Annotated[list, add_messages]
    
    # Knowledge
    document_ids: list[str] = Field(default_factory=list)
    current_context_summary: str | None = None
    
    # Control
    iterations: int = Field(default=0)
    max_iterations: int = Field(default=15)
    
    # Output
    final_answer: str | None = Field(default=None)

Well-designed state = predictable, efficient, and maintainable agents.

State is not just data storage, it is the architecture of your LangGraph application.

Common State Patterns

Here are the most effective and widely used patterns for designing state in LangGraph:

1. Metadata Pattern

Used to store control information, flags, and auxiliary data separate from core conversation.

class AgentState(MessagesState):
    messages: Annotated[list, add_messages]
    
    # Metadata fields
    metadata: dict = Field(default_factory=dict)
    iterations: int = Field(default=0)
    confidence: float = Field(default=0.0)
    task_status: Literal["running", "complete", "failed"] = "running"
    last_tool_used: str | None = None

Usage:

def agent_node(state: AgentState):
    return {
        "messages": [response],
        "metadata": {**state.get("metadata", {}), "last_updated": "now"},
        "iterations": state.get("iterations", 0) + 1
    }

2. Scratchpad Pattern

A temporary workspace for intermediate thoughts, plans, or calculations.

class AgentState(MessagesState):
    messages: Annotated[list, add_messages]
    scratchpad: str = Field(default="")           # Agent's private notes
    plan: list[str] = Field(default_factory=list) # Step-by-step plan
    working_memory: dict = Field(default_factory=dict)

Example Usage:

def planner_node(state: AgentState):
    plan = llm.invoke(f"Create a plan for: {state['messages'][-1].content}")
    return {
        "scratchpad": plan.content,
        "plan": plan.content.split("\n")
    }

This pattern helps agents maintain internal reasoning separate from final output.

3. Tool Result Pattern

Structured way to store and manage tool execution results.

class ToolResult(BaseModel):
    tool_name: str
    input: dict
    output: str
    success: bool
    timestamp: str

class AgentState(MessagesState):
    messages: Annotated[list, add_messages]
    tool_results: list[ToolResult] = Field(default_factory=list)
    tool_history: Annotated[list, add] = Field(default_factory=list)

Usage:

def tools_node(state: AgentState):
    results = execute_tools(...)
    tool_result = ToolResult(...)
    
    return {
        "tool_results": [tool_result],
        "tool_history": [tool_result]
    }

4. Memory-Augmented State

Combining short-term (messages) and long-term memory.

class AgentState(MessagesState):
    messages: Annotated[list, add_messages]
    
    # Long-term memory
    long_term_memory: list[dict] = Field(default_factory=list)
    user_profile: dict = Field(default_factory=dict)
    
    # Summarized history
    conversation_summary: str | None = Field(default=None)

This pattern is essential for long-running agents.

Common Mistakes with State

1. Overloading State

Putting too many unrelated things into state.

Bad Example:

class BadState(TypedDict):
    data: dict                    # Everything dumped here
    temp: Any
    cache: dict
    everything: list

Problem: Hard to understand, debug, and maintain.

2. Poor State Naming

Using vague names like


           data


           result


           info


           stuff

Bad:

result: list
info: dict
temp_data: Any

Good:

retrieved_documents: list[dict]
current_plan: list[str]
user_preferences: dict

3. Mutable State Confusion

Trying to mutate state directly inside nodes.

# Wrong
def bad_node(state):
    state["messages"].append(new_msg)   # Don't do this!
    return state

Correct:

def good_node(state):
    return {"messages": [new_msg]}   # Let reducer handle it

4. Storing Too Much Data

Storing large objects, full documents, or binary data directly in state.

Bad:

full_documents: list[Document]   # Very large
raw_api_responses: list[dict]

Better: Store references or summaries instead.

5. Unstructured State Design

No clear organization or separation of concerns.

Bad:

class MessyState(TypedDict):
    x: Any
    y: Any
    z: Any

Good:

class CleanState(MessagesState):
    messages: Annotated[list, add_messages]
    knowledge: list[dict]
    control: dict
    output: dict

Best Practices for State Design

1. Keep State Minimal

Only include fields that are actively used during graph execution.

# Good
class AgentState(MessagesState):
    messages: Annotated[list, add_messages]
    documents: list[dict] = Field(default_factory=list)
    iterations: int = Field(default=0)

Avoid storing full database results or large raw data.

2. Use Typed State

Always define your state with types.

Preferred (Pydantic):

class AgentState(BaseModel):
    messages: Annotated[list, add_messages] = Field(default_factory=list)
    documents: list[dict] = Field(default_factory=list)
    iterations: int = Field(default=0)

3. Separate Messages from Metadata

class AgentState(MessagesState):
    # Conversation
    messages: Annotated[list, add_messages]
    
    # Metadata / Control
    metadata: dict = Field(default_factory=dict)
    control: dict = Field(default_factory=dict)
    
    # Knowledge
    documents: list[dict] = Field(default_factory=list)

This separation improves clarity and maintainability.

4. Design Clear State Flows

Document which nodes read/write which fields.

class AgentState(MessagesState):
    messages: Annotated[list, add_messages]           # Written by: agent, tools
    documents: list[dict] = Field(default_factory=list) # Written by: retriever
    iterations: int = Field(default=0)                # Written by: agent
    final_answer: str | None = Field(default=None)    # Written by: final node

5. Use Reducers Carefully

Only use custom reducers when necessary.

# Good - Simple and clear
messages: Annotated[list, add_messages]
documents: Annotated[list[dict], lambda a, b: a + b]

# Avoid over-engineering
# custom_reducer_that_does_too_much

Final Recommendation:

State should be like a well-organized desk — everything has its place, and only necessary items are kept on it.

A clean, well-designed state makes your entire LangGraph application more reliable, debuggable, and scalable.

AI agent LangChain LangGraph Python

← All training