LangGraph Agents · Lesson 2 of 17
Nodes, Edges, and State: The Core Concepts
Graphs, Nodes, and Edges
LangGraph models agent execution as a directed graph. Each step in the agent's reasoning is a node. Each possible transition between steps is an edge. The framework executes the graph by walking from the entry node to the exit node, passing state through every step along the way.
This mental model — nodes as functions, edges as transitions — is what makes LangGraph agents debuggable, resumable, and composable in ways that plain Python control flow is not.
The Directed Graph
A directed graph is a collection of nodes connected by directed edges. "Directed" means each edge has a source and a destination — execution flows in one direction along each edge.
In LangGraph:
- Nodes are Python functions that do work (call an LLM, query a database, run a tool)
- Edges are the connections that determine which node runs next
- State is a shared data structure that every node reads from and writes to
- START is the implicit source node — execution begins here
- END is the implicit sink node — execution stops here
A minimal three-node graph looks like this conceptually:
START → retrieve → generate → format → END
Execution begins at START, passes through each node in sequence, and terminates at END.
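To make the walk concrete, here is a minimal pure-Python sketch of the same linear pipeline. It is not how LangGraph is implemented: the `run_linear` helper, the `NODES` and `EDGES` tables, and the stub node bodies are all made up for illustration. It only shows the idea of following one edge at a time and merging each node's returned keys into the state.

```python
# Illustrative only: a hand-rolled walker, not LangGraph's execution engine.
START, END = "__start__", "__end__"


def retrieve(state: dict) -> dict:
    return {"retrieved_docs": [f"doc about {state['query']}"]}


def generate(state: dict) -> dict:
    return {"answer": f"answer built from {len(state['retrieved_docs'])} doc(s)"}


def format_(state: dict) -> dict:
    return {"formatted_output": state["answer"].upper()}


# Hypothetical lookup tables: node name -> function, node name -> next node name.
NODES = {"retrieve": retrieve, "generate": generate, "format": format_}
EDGES = {START: "retrieve", "retrieve": "generate", "generate": "format", "format": END}


def run_linear(initial_state: dict) -> tuple[dict, list[str]]:
    """Follow one edge at a time from START until END is reached."""
    state = dict(initial_state)
    visited = []
    current = EDGES[START]
    while current != END:
        update = NODES[current](state)   # the node returns only the keys it changes
        state = {**state, **update}      # merge the partial update into a new state dict
        visited.append(current)
        current = EDGES[current]
    return state, visited


final_state, path = run_linear({"query": "vector indexes"})
print(path)
print(final_state["formatted_output"])
```

The visited list makes the execution order explicit, which is the property the graph model is meant to give you.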
Nodes
A node is a plain Python function. It takes the current state as its only argument and returns a dictionary containing only the keys it wants to update.
from typing import TypedDict


class PipelineState(TypedDict):
    query: str
    retrieved_docs: list[str]
    answer: str
    formatted_output: str


def retrieve_node(state: PipelineState) -> dict:
    """Retrieve relevant documents for the query."""
    query = state["query"]
    # Simulate retrieval
    docs = [
        f"Document about {query}: content here...",
        f"Another document about {query}: more content...",
    ]
    return {"retrieved_docs": docs}


def generate_node(state: PipelineState) -> dict:
    """Generate an answer using retrieved documents."""
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    context = "\n".join(state["retrieved_docs"])
    prompt = f"Context:\n{context}\n\nQuestion: {state['query']}\n\nAnswer:"
    response = llm.invoke(prompt)
    return {"answer": response.content}


def format_node(state: PipelineState) -> dict:
    """Format the answer for display."""
    answer = state["answer"]
    formatted = f"**Answer:**\n\n{answer}\n\n---\n*Sources: {len(state['retrieved_docs'])} documents retrieved*"
    return {"formatted_output": formatted}

Key rules for node functions:
- Single argument: always state: YourStateType
- Return a dict: only include keys you are updating; missing keys are left unchanged
- No side effects on state: never mutate state in place; always return a new dict
- Any Python logic allowed: call APIs, query databases, run tools, call other functions
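Because nodes are plain functions, these rules can be checked without building a graph. The sketch below is only an illustration: `DocState`, `word_count_node`, and `apply_update` are hypothetical names, and `apply_update` merely mimics the "returned keys overwrite, other keys stay" behavior described above rather than calling LangGraph.

```python
from typing import TypedDict


class DocState(TypedDict, total=False):
    text: str
    word_count: int


def word_count_node(state: DocState) -> dict:
    """Return only the key this node updates; the input state is left untouched."""
    return {"word_count": len(state["text"].split())}


def apply_update(state: dict, update: dict) -> dict:
    """Hypothetical stand-in for the merge step: build a new dict, do not mutate."""
    return {**state, **update}


before = {"text": "nodes return partial updates"}
update = word_count_node(before)
after = apply_update(before, update)

print(update)   # the partial update returned by the node
print(before)   # the original state, unchanged
print(after)    # the merged state
```

Calling a node directly like this is also a convenient way to unit test it in isolation.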
Edges
Edges connect nodes. There are two kinds:
Unconditional Edges
An unconditional edge always goes from node A to node B, no matter what state contains.
from langgraph.graph import StateGraph, START, END
builder = StateGraph(PipelineState)
builder.add_node("retrieve", retrieve_node)
builder.add_node("generate", generate_node)
builder.add_node("format", format_node)
# Unconditional edges — always follow this path
builder.add_edge(START, "retrieve")
builder.add_edge("retrieve", "generate")
builder.add_edge("generate", "format")
builder.add_edge("format", END)
app = builder.compile()
Conditional Edges
A conditional edge inspects state and returns the name of the next node to visit.
def route_after_generate(state: PipelineState) -> str:
    """Decide what to do after generation."""
    answer = state.get("answer", "")
    if len(answer) < 50:
        # Answer is too short: retrieve more documents
        return "retrieve"
    elif "i don't know" in answer.lower():
        # LLM expressed uncertainty: try again
        # (the phrase is lowercase so it can match the lowercased answer)
        return "generate"
    else:
        # Good answer: format and finish
        return "format"


builder.add_conditional_edges(
    "generate",               # from this node
    route_after_generate,     # call this function to decide
    {
        "retrieve": "retrieve",   # if function returns "retrieve", go there
        "generate": "generate",   # if "generate", loop back
        "format": "format",       # if "format", proceed
    },
)

The mapping dict {"key": "node_name"} translates the router function's return value to a node name. If the keys and node names are identical (as above), you can omit the dict and LangGraph uses the return value directly as the node name.
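One thing to watch with routers that loop back: nothing in the routing logic itself stops execution from returning to the same node indefinitely. LangGraph stops a run that exceeds its recursion limit, but a router can also bound its own retries. The sketch below assumes a hypothetical `attempts` counter in the state (not part of PipelineState) and a hypothetical `MAX_ATTEMPTS` constant. Since a router is a plain function, it can be exercised without a graph.

```python
from typing import TypedDict

MAX_ATTEMPTS = 3  # hypothetical cap, chosen only for illustration


class RetryState(TypedDict, total=False):
    answer: str
    attempts: int  # hypothetical counter that the generate node would increment


def route_with_cap(state: RetryState) -> str:
    """Return the next node name, giving up on retries after MAX_ATTEMPTS."""
    answer = state.get("answer", "")
    attempts = state.get("attempts", 0)
    if attempts >= MAX_ATTEMPTS:
        return "format"      # stop looping and format whatever we have
    if len(answer) < 50:
        return "retrieve"    # too short: fetch more context
    if "i don't know" in answer.lower():
        return "generate"    # uncertain: try generating again
    return "format"


short = {"answer": "Too short.", "attempts": 1}
unsure = {
    "answer": "I don't know, but here is a long enough reply to pass the length check.",
    "attempts": 1,
}
exhausted = {"answer": "Too short.", "attempts": 3}

print(route_with_cap(short))
print(route_with_cap(unsure))
print(route_with_cap(exhausted))
```

Keeping the cap in the router means the exit condition is visible in the same place as the rest of the routing decisions.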
Entry Point and Finish Point
Every graph needs an entry point (at least one edge out of START) and at least one finish point (a path that reaches END).
Entry point: where execution begins. Use add_edge(START, first_node) or the older set_entry_point(first_node) method:
from langgraph.graph import START, END

# Modern style (preferred)
builder.add_edge(START, "retrieve")

# Older style (still works); it does the same thing, so use one or the other
builder.set_entry_point("retrieve")

Finish point: where execution terminates. Use add_edge(last_node, END). You can have multiple finish points by routing from different nodes to END.
# Single finish point
builder.add_edge("format", END)

# Multiple finish points
# (assumes "check" and "error_handler" nodes have been added to the builder
#  and that the state schema includes an "error" key)
def route_after_check(state: PipelineState) -> str:
    if state.get("error"):
        return "error_handler"
    return END

builder.add_conditional_edges("check", route_after_check, {END: END, "error_handler": "error_handler"})
builder.add_edge("error_handler", END)
A Complete Three-Node Graph
Here is a self-contained example that retrieves documents, generates an answer, and formats the output:
import os
from typing import TypedDict

from langgraph.graph import StateGraph, START, END
from langchain_openai import ChatOpenAI

os.environ["OPENAI_API_KEY"] = "sk-..."

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)


class ResearchState(TypedDict):
    query: str
    retrieved_docs: list[str]
    answer: str
    formatted_output: str


# ── Node definitions ──────────────────────────────────────────────────────────

def retrieve(state: ResearchState) -> dict:
    """Simulate document retrieval."""
    q = state["query"]
    docs = [
        f"Overview of {q}: This topic involves several key components...",
        f"Technical details of {q}: The implementation typically uses...",
        f"Use cases for {q}: Common applications include...",
    ]
    print(f"[retrieve] Found {len(docs)} documents for: {q}")
    return {"retrieved_docs": docs}


def generate(state: ResearchState) -> dict:
    """Generate an answer from retrieved documents."""
    context = "\n\n".join(state["retrieved_docs"])
    prompt = (
        f"You are a helpful assistant. Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {state['query']}\n\n"
        f"Answer:"
    )
    response = llm.invoke(prompt)
    print(f"[generate] Generated {len(response.content)} character answer")
    return {"answer": response.content}


def format_output(state: ResearchState) -> dict:
    """Format the final output."""
    num_docs = len(state["retrieved_docs"])
    output = (
        f"# Research Answer\n\n"
        f"**Query:** {state['query']}\n\n"
        f"**Answer:**\n\n{state['answer']}\n\n"
        f"---\n"
        f"*Based on {num_docs} retrieved documents*"
    )
    print(f"[format] Output formatted ({len(output)} chars)")
    return {"formatted_output": output}


# ── Graph construction ────────────────────────────────────────────────────────

builder = StateGraph(ResearchState)
builder.add_node("retrieve", retrieve)
builder.add_node("generate", generate)
builder.add_node("format_output", format_output)

builder.add_edge(START, "retrieve")
builder.add_edge("retrieve", "generate")
builder.add_edge("generate", "format_output")
builder.add_edge("format_output", END)

app = builder.compile()

# ── Invocation ────────────────────────────────────────────────────────────────

result = app.invoke({"query": "What is vector database indexing?"})
print("\n" + result["formatted_output"])
Visualizing the Graph
LangGraph can render your graph as a Mermaid diagram for documentation or debugging:
# Get the Mermaid diagram source
print(app.get_graph().draw_mermaid())
Output (simplified):
graph TD
    START --> retrieve
    retrieve --> generate
    generate --> format_output
    format_output --> END
You can also save it as a PNG image. By default, draw_mermaid_png() renders the diagram through the mermaid.ink web service, so it needs network access; the separate draw_png() method renders locally and requires pygraphviz.
# Save as PNG (rendered through the Mermaid web API by default)
png_data = app.get_graph().draw_mermaid_png()
with open("graph.png", "wb") as f:
    f.write(png_data)
Parallel Node Execution (Fan-out / Fan-in)
LangGraph supports running multiple nodes in parallel by adding edges from one node to several nodes simultaneously. The graph waits for all parallel branches to complete before proceeding.
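As a rough mental model of fan-out and fan-in (not LangGraph's actual scheduler), the sketch below runs two branch functions concurrently with the standard library, waits for both, merges their partial updates, and only then runs the join step. The helper name `run_parallel_step` and its same-key conflict check are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def web_search(state: dict) -> dict:
    return {"web_results": [f"Web result for {state['query']}"]}


def db_lookup(state: dict) -> dict:
    return {"db_results": [f"DB record for {state['query']}"]}


def combine(state: dict) -> dict:
    all_results = state["web_results"] + state["db_results"]
    return {"combined_answer": " | ".join(all_results)}


def run_parallel_step(state: dict, branches: list) -> dict:
    """Run every branch against the same input state, then merge the updates."""
    with ThreadPoolExecutor() as pool:
        # list(...) blocks until every branch has returned
        updates = list(pool.map(lambda fn: fn(state), branches))
    merged = dict(state)
    written = set()
    for update in updates:
        clash = written.intersection(update)
        if clash:
            # Two branches wrote the same key; this sketch refuses to guess a winner.
            raise ValueError(f"branches wrote the same key(s): {sorted(clash)}")
        merged.update(update)
        written.update(update)
    return merged


state = {"query": "LangGraph tutorial"}
state = run_parallel_step(state, [web_search, db_lookup])  # fan-out, then wait for both
state = {**state, **combine(state)}                        # fan-in
print(state["combined_answer"])
```

Each branch here writes a different key, which is why the merge is unambiguous. When parallel branches need to write to the same key, LangGraph expects a reducer to be declared for that key on the state schema, a topic this lesson does not cover.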
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class ParallelState(TypedDict):
    query: str
    web_results: list[str]
    db_results: list[str]
    combined_answer: str


def web_search(state: ParallelState) -> dict:
    # Simulate web search
    return {"web_results": [f"Web result for {state['query']}"]}


def db_lookup(state: ParallelState) -> dict:
    # Simulate database lookup
    return {"db_results": [f"DB record for {state['query']}"]}


def combine(state: ParallelState) -> dict:
    all_results = state["web_results"] + state["db_results"]
    answer = f"Combined {len(all_results)} results: " + " | ".join(all_results)
    return {"combined_answer": answer}


builder = StateGraph(ParallelState)
builder.add_node("web_search", web_search)
builder.add_node("db_lookup", db_lookup)
builder.add_node("combine", combine)

# Fan-out: START goes to both parallel nodes
builder.add_edge(START, "web_search")
builder.add_edge(START, "db_lookup")

# Fan-in: both parallel nodes converge to combine
builder.add_edge("web_search", "combine")
builder.add_edge("db_lookup", "combine")
builder.add_edge("combine", END)

app = builder.compile()
result = app.invoke({"query": "LangGraph tutorial"})
print(result["combined_answer"])
Summary
| Concept | What it is | LangGraph API |
|---|---|---|
| Graph | Directed graph of functions | StateGraph(State) |
| Node | Python function doing work | add_node(name, fn) |
| Unconditional edge | Always go A → B | add_edge(A, B) |
| Conditional edge | Go to A, B, or C based on state | add_conditional_edges(from, fn, map) |
| Entry point | Where execution starts | add_edge(START, node) |
| Finish point | Where execution ends | add_edge(node, END) |
| Compiled graph | Runnable Pregel graph | builder.compile() |
The graph model gives you explicit, inspectable control flow. Every possible path through your agent is visible in the graph structure — which makes debugging, testing, and reasoning about agent behavior dramatically easier than implicit control flow in a Python function.
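As a small illustration of what "inspectable" can mean in practice, the sketch below enumerates every START-to-END path in a hand-written edge table. The `EDGES` table and the `simple_paths` helper are hypothetical and independent of LangGraph; loops are skipped by never revisiting a node within one path.

```python
START, END = "__start__", "__end__"

# Hypothetical edge table: "check" can branch, and "generate" can loop back to "check".
EDGES = {
    START: ["check"],
    "check": ["error_handler", "generate"],
    "generate": ["check", "format"],
    "error_handler": [END],
    "format": [END],
}


def simple_paths(node: str = START, seen: tuple = ()) -> list[list[str]]:
    """List every START-to-END path that visits each node at most once."""
    if node == END:
        return [[*seen, END]]
    visited = (*seen, node)
    paths = []
    for nxt in EDGES.get(node, []):
        if nxt not in visited:   # skip edges that would revisit a node (loops)
            paths.extend(simple_paths(nxt, visited))
    return paths


for path in simple_paths():
    print(" -> ".join(path))
```

A list like this can double as a checklist of the paths a test suite should exercise.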