AI Agents LangGraph
State in LangGraph
Intermediate
This post explores everything about
State in LangGraph,
how it works as shared memory between nodes, how data flows through graphs, and how state is updated, merged, and validated. We cover defining state with
TypedDict
and
Pydantic
, advanced concepts like conditional routing, loops, subgraphs, and multi-agent systems, plus performance optimization, common patterns, mistakes to avoid, and best practices for scalable LangGraph applications.
What Is State in LangGraph?
State is the central nervous system of any LangGraph application. It is a shared data structure that holds all the information needed during the execution of your graph.
Every node in the graph receives the current state, performs some work, and returns updates that get merged back into the state. This makes LangGraph stateful — meaning it can remember information across multiple steps, loops, and even separate runs (when using a checkpointer).
Think of State as a shared whiteboard that all nodes can read from and write to.
How State Works in LangGraph
State works through a define → update → merge cycle:
- You define a state schema (what data the graph should track).
- Each node receives a copy of the current state.
- The node returns a partial update (only the fields it wants to change).
- LangGraph merges the updates using reducers.
- The updated state is passed to the next node.
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages
# 1. Define State Schema
class AgentState(TypedDict):
messages: Annotated[list, add_messages] # Conversation history
documents: list[dict] # Retrieved context
iterations: int # Loop counter
# 2. Node that reads and updates state
def agent_node(state: AgentState):
# Read from state
history = state["messages"]
docs = state.get("documents", [])
# Call LLM with full context
response = llm.invoke(history)
# Return partial update
return {
"messages": [response], # add_messages reducer will append
"iterations": state.get("iterations", 0) + 1
}
State as Shared Memory
State acts as shared memory across the entire graph execution:
def retriever_node(state: AgentState):
query = state["messages"][-1].content
docs = vector_store.similarity_search(query)
return {"documents": docs} # Shared with all future nodes
def agent_node(state: AgentState):
# Can access documents retrieved earlier
context = state.get("documents", [])
# ... use context for better response
Even in loops, the same state object (with accumulated data) is passed around. This is what allows agents to have
long-term memory
during execution.
State Flow Between Nodes
State flows in this pattern:
Node A → returns update → Reducer merges → Updated State → Node B
Important Rules:
- Nodes never modify the state directly.
- They return a dictionary of updates.
- LangGraph handles the merging safely using reducers.
Example of State Flow:
def agent_node(state):
return {"messages": [AIMessage(content="...")]}
def tools_node(state):
return {"messages": [ToolMessage(content="...")]}
# State flows like this:
# Start: {"messages": [HumanMessage]}
# After agent: {"messages": [HumanMessage, AIMessage]}
# After tools: {"messages": [HumanMessage, AIMessage, ToolMessage]}
Reading and Updating State
Reading State
Nodes receive the full current state as the first argument:
def my_node(state: AgentState, config: RunnableConfig):
# Reading examples
last_message = state["messages"][-1]
all_docs = state.get("documents", [])
current_iteration = state.get("iterations", 0)
# You can also access config
thread_id = config["configurable"].get("thread_id")
Updating State
Always return a partial dictionary:
def my_node(state: AgentState):
# Correct way - partial update
return {
"messages": [new_message],
"documents": new_documents,
"iterations": state.get("iterations", 0) + 1
}
Never do this:
# Wrong - modifying state directly
state["messages"].append(new_message) # Don't do this!
return state
State is not just data, it is the single source of truth for your entire graph execution.
Mastering state design is one of the most important skills when building reliable LangGraph applications.
Immutable vs Mutable Thinking
One of the most important mindset shifts when working with LangGraph is moving from mutable thinking to immutable thinking.
Mutable Thinking (Traditional Python): You modify objects in place.
state["messages"].append(new_message) # Wrong in LangGraph!
state["iterations"] += 1
Immutable Thinking (LangGraph Style):
You never modify the state directly. You return a new update that LangGraph merges for you.
# Correct way
return {
"messages": [new_message], # Partial update
"iterations": state.get("iterations", 0) + 1
}
Why LangGraph enforces this:
- Safe concurrent execution
- Predictable reducer behavior
- Better debugging and traceability
- Works seamlessly with checkpointing and persistence
Rule of Thumb:
Treat state as
read-only
inside a node. Always return updates instead of mutating.
State Lifecycle in a Graph
The state follows a clear lifecycle during graph execution:
-
Initialization
Graph starts with the initial state you provide to .invoke() or .stream(). -
Node Receives State
Every node gets a copy of the current state. -
Node Processing
Node reads the state and performs logic. -
Partial Update Returned
Node returns only the fields it wants to change. -
Reducer Merging
LangGraph merges the update using the defined reducers (e.g. add_messages). -
Updated State Passed Forward
The new merged state is sent to the next node. - Repeat (in cycles) or End at END.
Initial State → Node A → Update → Reducer Merge → Updated State → Node B → ...
This cycle continues until the graph reaches END or an interrupt.
StateGraph and Shared State
StateGraph
is designed around the concept of
shared state
— a single source of truth that all nodes can read from and contribute to.
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import MessagesState
class AgentState(MessagesState):
documents: list[dict] = []
iterations: int = 0
final_answer: str | None = None
graph = StateGraph(AgentState)
# All nodes share the same state object (with proper merging)
graph.add_node("retriever", retriever_node) # Writes documents
graph.add_node("agent", agent_node) # Reads documents + messages
graph.add_node("tools", tool_node) # Reads messages, writes tool results
graph.add_edge(START, "retriever")
graph.add_conditional_edges("agent", route_tools)
Key Point:
Even though each node receives its own view of the state, the merged result becomes the shared state for the next node. This creates powerful coordination between nodes.
Even though each node receives its own view of the state, the merged result becomes the shared state for the next node. This creates powerful coordination between nodes.
Defining State with TypedDict
TypedDict
is the most common and lightweight way to define state in LangGraph.
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages
class AgentState(TypedDict):
# Required fields with reducers
messages: Annotated[list, add_messages]
# Optional fields
documents: Annotated[list[dict], lambda a, b: a + b] # Custom reducer
iterations: int
confidence: float
next: str | None
Advantages:
- Simple and lightweight
- Excellent IDE support
- No extra dependencies
Defining State with Dataclass
You can also use Python’s dataclass (less common but valid).
from dataclasses import dataclass, field
from typing import Annotated, Any
from langgraph.graph.message import add_messages
@dataclass
class AgentState:
messages: Annotated[list, add_messages] = field(default_factory=list)
documents: list[dict] = field(default_factory=list)
iterations: int = field(default=0)
confidence: float = field(default=0.0)
metadata: dict[str, Any] = field(default_factory=dict)
Advantages:
- Clean, familiar syntax
- Good for simple cases
- Less powerful validation than Pydantic
- Slightly more verbose with reducers
Key Recommendation (2025+):
# Best balance for most projects
from pydantic import BaseModel, Field
from langgraph.graph.message import MessagesState
class AgentState(MessagesState): # Inherit from MessagesState for convenience
documents: list[dict] = Field(default_factory=list)
iterations: int = Field(default=0)
TypedDict vs Pydantic vs Dataclass in LangGraph
| Criteria | TypedDict | Pydantic (BaseModel) | Dataclass |
|---|---|---|---|
| Recommended For | Most common use cases | Production, complex apps | Simple cases |
| Type Checking | Excellent (static) | Excellent + Runtime validation | Good |
| Default Values | Manual |
Excellent (
Field(default=...)
)
|
Good (
field(default=...)
)
|
Reducers (
Annotated
)
|
Best support | Excellent | Good |
| Validation | None (static only) | Very Strong (runtime) | Basic |
| Serialization | Manual |
Built-in (
model_dump()
)
|
Manual |
| Performance | Fastest | Very Fast | Fast |
| Boilerplate | Low | Medium | Medium |
| IDE Support | Excellent | Best | Good |
| LangGraph Compatibility | Excellent | Excellent | Good |
1. TypedDict (Most Popular)
Best for: Most LangGraph applications, tutorials, and medium-complexity agents.
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages
class AgentState(TypedDict):
messages: Annotated[list, add_messages]
documents: Annotated[list[dict], lambda a, b: a + b] # Custom reducer
iterations: int
confidence: float
next: str | None
Pros:
- Lightweight and simple
- No extra dependencies
- Great IDE autocomplete
- Easy to understand
- No runtime validation
- Default values are tricky
2. Pydantic (Recommended for Production)
Best for: Production systems, large agents, when you need validation and defaults.
from pydantic import BaseModel, Field
from typing import Annotated
from langgraph.graph.message import add_messages
class AgentState(BaseModel):
messages: Annotated[list, add_messages] = Field(default_factory=list)
documents: list[dict] = Field(default_factory=list)
iterations: int = Field(default=0)
confidence: float = Field(default=0.0, ge=0.0, le=1.0) # Validation
final_answer: str | None = Field(default=None)
class Config:
arbitrary_types_allowed = True
Pros:
- Runtime validation
- Excellent default values
- Built-in serialization (model_dump())
- Great error messages
- Can inherit from MessagesState
- Slightly more overhead (still negligible)
3. Dataclass
Best for
:
Very simple graphs where you prefer class syntax.
from dataclasses import dataclass, field
from typing import Annotated, Any
from langgraph.graph.message import add_messages
@dataclass
class AgentState:
messages: Annotated[list, add_messages] = field(default_factory=list)
documents: list[dict] = field(default_factory=list)
iterations: int = field(default=0)
metadata: dict[str, Any] = field(default_factory=dict)
Pros:
- Clean, familiar Python syntax
- Good for small teams
- Weaker validation
- More verbose with reducers
- Less popular in LangGraph community
Modern Recommendation
For most users:
# Best balance - Inherit from MessagesState
from pydantic import BaseModel, Field
from langgraph.graph.message import MessagesState
class AgentState(MessagesState): # Recommended approach
documents: list[dict] = Field(default_factory=list)
iterations: int = Field(default=0)
confidence: float = Field(default=0.0)
Why?
- You get all the benefits of MessagesState (proper message handling)
- Plus Pydantic’s validation and defaults
- Clean and scalable
State-Driven Routing
State-Driven Routing means making routing decisions (which node to go to next) based on the current state of the graph. This is the foundation of intelligent, dynamic workflows in LangGraph.
Instead of hard-coded logic, the router inspects the shared state and decides the next step.
Using State in Conditional Edges
This is the most common and powerful use of state-driven routing.
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import MessagesState
class AgentState(MessagesState):
documents: list[dict] = []
iterations: int = 0
confidence: float = 0.0
task_complete: bool = False
def route_after_agent(state: AgentState):
last_message = state["messages"][-1]
iterations = state.get("iterations", 0)
# Multiple state-based decisions
if iterations >= 15:
return "END"
elif last_message.tool_calls:
return "tools"
elif state.get("confidence", 0) > 0.85 or state.get("task_complete"):
return "END"
elif len(state.get("documents", [])) == 0:
return "retriever"
else:
return "agent" # Continue loop
graph = StateGraph(AgentState)
graph.add_node("agent", agent_node)
graph.add_node("tools", tool_node)
graph.add_node("retriever", retriever_node)
graph.add_conditional_edges("agent", route_after_agent)
Key Advantage: The router has full access to all accumulated state (messages, documents, counters, flags, etc.).
State in Multi-Agent Systems
In multi-agent setups, state acts as the shared context between agents.
class TeamState(MessagesState):
documents: list[dict] = []
iterations: int = 0
current_agent: str = "supervisor"
task_status: str = "in_progress"
def supervisor_router(state: TeamState):
last = state["messages"][-1].content.lower()
if "research" in last:
return "research_agent"
elif "code" in last:
return "coder_agent"
elif state.get("task_status") == "complete":
return "END"
else:
return "critic_agent"
graph = StateGraph(TeamState)
graph.add_node("supervisor", supervisor_node)
graph.add_node("research_agent", research_agent)
graph.add_node("coder_agent", coder_agent)
graph.add_node("critic_agent", critic_agent)
graph.add_conditional_edges("supervisor", supervisor_router)
# All agents return to supervisor (shared state)
graph.add_edge("research_agent", "supervisor")
graph.add_edge("coder_agent", "supervisor")
graph.add_edge("critic_agent", "supervisor")
The shared state allows agents to collaborate effectively by reading and updating common information.
State in Subgraphs
Subgraphs can have their own state schema, but they often share or map to the parent state.
# Subgraph with its own focused state
class ResearchState(TypedDict):
messages: Annotated[list, add_messages]
query: str
research_results: list[dict]
def create_research_subgraph():
subgraph = StateGraph(ResearchState)
subgraph.add_node("planner", planner_node)
subgraph.add_node("researcher", researcher_node)
subgraph.add_edge(START, "planner")
subgraph.add_edge("planner", "researcher")
subgraph.add_edge("researcher", END)
return subgraph.compile()
# Main graph
class MainState(MessagesState):
documents: list[dict] = []
research_complete: bool = False
main_graph = StateGraph(MainState)
main_graph.add_node("research_team", create_research_subgraph())
Passing State Between Subgraphs
There are two main ways to pass state:
1. Automatic State Mapping (Recommended)
from langgraph.graph import StateGraph
class MainState(MessagesState):
documents: list[dict] = []
research_summary: str | None = None
# Subgraph input/output mapping
research_subgraph = create_research_subgraph().with_config(
# Map main state to subgraph state
input_mapping={"messages": "messages", "documents": "research_results"},
output_mapping={"research_results": "documents"}
)
main_graph.add_node("research_team", research_subgraph)
2. Manual State Transformation
def research_entry_point(state: MainState):
return {
"messages": state["messages"],
"query": state["messages"][-1].content
}
def research_exit_point(state: ResearchState) -> MainState:
return {
"documents": state["research_results"],
"research_summary": "Research completed"
}
Performance Considerations
Large or frequently updated state can impact performance:
- Deep copying happens on every node execution
- Large message histories increase token usage and latency
- Checkpointing large states increases storage cost
- Keep state minimal
- Summarize old messages regularly
- Use trim_messages or summarization nodes in loops
Large State Management
When your state becomes large:
class OptimizedState(MessagesState):
messages: Annotated[list, add_messages]
documents: list[dict] = Field(default_factory=list)
# Instead of storing full results, store references
document_ids: list[str] = Field(default_factory=list)
summary: str | None = None # Keep summarized version
Strategies:
- Store references (IDs) instead of full objects
- Use vector store + retriever instead of dumping everything into state
- Periodically summarize conversation history
State Optimization Strategies
Best Practices:
- Use MessagesState as base when possible
- Separate concerns (conversation vs knowledge vs control)
- Summarize aggressively in long-running graphs
- Use Pydantic for better defaults and validation
- Avoid storing large binary data in state
class AgentState(MessagesState):
messages: Annotated[list, add_messages]
# Knowledge
document_ids: list[str] = Field(default_factory=list)
current_context_summary: str | None = None
# Control
iterations: int = Field(default=0)
max_iterations: int = Field(default=15)
# Output
final_answer: str | None = Field(default=None)
Well-designed state = predictable, efficient, and maintainable agents.
State is not just data storage, it is the architecture of your LangGraph application.
Common State Patterns
Here are the most effective and widely used patterns for designing state in LangGraph:
1. Metadata Pattern
Used to store control information, flags, and auxiliary data separate from core conversation.
class AgentState(MessagesState):
messages: Annotated[list, add_messages]
# Metadata fields
metadata: dict = Field(default_factory=dict)
iterations: int = Field(default=0)
confidence: float = Field(default=0.0)
task_status: Literal["running", "complete", "failed"] = "running"
last_tool_used: str | None = None
Usage:
def agent_node(state: AgentState):
return {
"messages": [response],
"metadata": {**state.get("metadata", {}), "last_updated": "now"},
"iterations": state.get("iterations", 0) + 1
}
2. Scratchpad Pattern
A temporary workspace for intermediate thoughts, plans, or calculations.
class AgentState(MessagesState):
messages: Annotated[list, add_messages]
scratchpad: str = Field(default="") # Agent's private notes
plan: list[str] = Field(default_factory=list) # Step-by-step plan
working_memory: dict = Field(default_factory=dict)
Example Usage:
def planner_node(state: AgentState):
plan = llm.invoke(f"Create a plan for: {state['messages'][-1].content}")
return {
"scratchpad": plan.content,
"plan": plan.content.split("\n")
}
This pattern helps agents maintain internal reasoning separate from final output.
3. Tool Result Pattern
Structured way to store and manage tool execution results.
class ToolResult(BaseModel):
tool_name: str
input: dict
output: str
success: bool
timestamp: str
class AgentState(MessagesState):
messages: Annotated[list, add_messages]
tool_results: list[ToolResult] = Field(default_factory=list)
tool_history: Annotated[list, add] = Field(default_factory=list)
Usage:
def tools_node(state: AgentState):
results = execute_tools(...)
tool_result = ToolResult(...)
return {
"tool_results": [tool_result],
"tool_history": [tool_result]
}
4. Memory-Augmented State
Combining short-term (messages) and long-term memory.
class AgentState(MessagesState):
messages: Annotated[list, add_messages]
# Long-term memory
long_term_memory: list[dict] = Field(default_factory=list)
user_profile: dict = Field(default_factory=dict)
# Summarized history
conversation_summary: str | None = Field(default=None)
This pattern is essential for long-running agents.
Common Mistakes with State
1. Overloading State
Putting too many unrelated things into state.
Bad Example:
class BadState(TypedDict):
data: dict # Everything dumped here
temp: Any
cache: dict
everything: list
Problem: Hard to understand, debug, and maintain.
2. Poor State Naming
Using vague names like
data
,
result
,
info
,
stuff
.
Bad:
result: list
info: dict
temp_data: Any
Good:
retrieved_documents: list[dict]
current_plan: list[str]
user_preferences: dict
3. Mutable State Confusion
Trying to mutate state directly inside nodes.
# Wrong
def bad_node(state):
state["messages"].append(new_msg) # Don't do this!
return state
Correct:
def good_node(state):
return {"messages": [new_msg]} # Let reducer handle it
4. Storing Too Much Data
Storing large objects, full documents, or binary data directly in state.
Bad:
full_documents: list[Document] # Very large
raw_api_responses: list[dict]
Better: Store references or summaries instead.
5. Unstructured State Design
No clear organization or separation of concerns.
Bad:
class MessyState(TypedDict):
x: Any
y: Any
z: Any
Good:
class CleanState(MessagesState):
messages: Annotated[list, add_messages]
knowledge: list[dict]
control: dict
output: dict
Best Practices for State Design
1. Keep State Minimal
Only include fields that are actively used during graph execution.
# Good
class AgentState(MessagesState):
messages: Annotated[list, add_messages]
documents: list[dict] = Field(default_factory=list)
iterations: int = Field(default=0)
Avoid storing full database results or large raw data.
2. Use Typed State
Always define your state with types.
Preferred (Pydantic):
class AgentState(BaseModel):
messages: Annotated[list, add_messages] = Field(default_factory=list)
documents: list[dict] = Field(default_factory=list)
iterations: int = Field(default=0)
3. Separate Messages from Metadata
class AgentState(MessagesState):
# Conversation
messages: Annotated[list, add_messages]
# Metadata / Control
metadata: dict = Field(default_factory=dict)
control: dict = Field(default_factory=dict)
# Knowledge
documents: list[dict] = Field(default_factory=list)
This separation improves clarity and maintainability.
4. Design Clear State Flows
Document which nodes read/write which fields.
class AgentState(MessagesState):
messages: Annotated[list, add_messages] # Written by: agent, tools
documents: list[dict] = Field(default_factory=list) # Written by: retriever
iterations: int = Field(default=0) # Written by: agent
final_answer: str | None = Field(default=None) # Written by: final node
5. Use Reducers Carefully
Only use custom reducers when necessary.
# Good - Simple and clear
messages: Annotated[list, add_messages]
documents: Annotated[list[dict], lambda a, b: a + b]
# Avoid over-engineering
# custom_reducer_that_does_too_much
Final Recommendation:
State should be like a well-organized desk — everything has its place, and only necessary items are kept on it.
A clean, well-designed state makes your entire LangGraph application more reliable, debuggable, and scalable.
AI agent LangChain LangGraph Python