
Customizing Memory in LangGraph Agents for Better Conversations

Customizing memory in LangGraph enhances LangChain agent conversations and UX. This tutorial covers deprecated types, migration to LangGraph persistence, simple checkpointers, custom implementations, persistent chat history, and optimization techniques for smarter LLM agents.

Jul 16, 2025

By Austin Vance


Right now everyone is building conversational agents, and having them remember past interactions is crucial for creating natural, engaging user experiences. LangChain, a powerful framework for developing LLM-based applications, has evolved its memory management: in v0.3.x the individual memory management classes were deprecated, and the recommended approach for agent memory is now LangGraph persistence. This tutorial dives into customizing memory using LangGraph, addressing common challenges like maintaining persistent chat history and optimizing for better conversations. Whether you're building chatbots or intelligent assistants, mastering LangGraph memory will enhance your agent's intelligence and make the UX feel more seamless across interactions.

Overview of Memory Types in LangChain

As of LangChain v0.3.1, several legacy memory types have been deprecated in favor of more robust persistence via LangGraph. Here's a quick overview of the deprecated types and the migration path:

  • ConversationBufferMemory: Deprecated. Previously stored entire conversation history.
  • ConversationBufferWindowMemory: Deprecated. Limited to recent messages.
  • ConversationSummaryMemory: Deprecated. Summarized interactions.
  • ConversationEntityMemory: Deprecated. Extracted and stored entities.

All of these have been replaced by LangGraph's checkpointing system, which provides built-in persistence, support for multiple threads, and advanced features like time travel. LangGraph uses checkpointers to manage state across conversations, such as MemorySaver (also exported as InMemorySaver) for in-memory storage and SqliteSaver for persistent storage.
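
For contrast, here's a minimal sketch of the legacy pattern being retired (assuming LangChain v0.3.x, where these classes still import but emit deprecation warnings):

from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Both of these emit LangChainDeprecationWarning in v0.3.x
memory = ConversationBufferMemory()
chain = ConversationChain(llm=llm, memory=memory)

chain.predict(input="Hi, I'm Bob.")
print(chain.predict(input="Who am I?"))  # History is stuffed into the prompt each turn

The rest of this tutorial shows the LangGraph replacement for this pattern.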

For more on the migration, refer to the official migration guide.

In this LangChain memory tutorial, we'll start with simple setups using LangGraph and progress to custom persistent implementations.

Setting Up Simple Memory Buffers

Let's begin by setting up a basic conversational agent with memory using LangGraph and InMemorySaver. This provides simple, in-memory persistence across interactions within the same thread.

First, ensure you have the necessary dependencies installed. We'll use LangChain version 0.3.26 (the latest as of July 2025), LangGraph, and OpenAI for the LLM.


uv add langchain langchain-openai langgraph langchain-community --bounds lower

Now, here's a Python example to create an agent with simple memory:

from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver

# Initialize the LLM (assumes OPENAI_API_KEY is set in your environment)
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Define a simple tool (e.g., for math calculations)
@tool
def multiply(a: int, b: int) -> int:
    """Multiplies two numbers."""
    return a * b

tools = [multiply]

# Set up memory with MemorySaver
memory = MemorySaver()

# Create the agent
agent_executor = create_react_agent(llm, tools, checkpointer=memory)

# Interact with the agent using a thread_id for memory
config = {"configurable": {"thread_id": "chat1"}}

response1 = agent_executor.invoke({"messages": [{"role": "user", "content": "What is 3 times 4?"}]}, config)
print(response1["messages"][-1].content)  # e.g., "3 times 4 is 12."

response2 = agent_executor.invoke({"messages": [{"role": "user", "content": "What was the result of the previous calculation?"}]}, config)
print(response2["messages"][-1].content)  # Recalls 12 from the same thread's history

This setup uses MemorySaver to store and recall chat history within the same thread, making your LangChain agents more context-aware. Because the history lives only in process memory, though, it disappears when the script exits; the next section shows how to make it survive restarts.

Custom Memory Implementation

For more advanced scenarios, such as persistent storage across restarts, use a persistent checkpointer like SqliteSaver. First, install the required package:


uv add langgraph-checkpoint-sqlite --bounds lower

Here's how to create a persistent memory using SqliteSaver, ensuring conversations survive restarts—perfect for production chat apps.

import sqlite3
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.sqlite import SqliteSaver

# Initialize the LLM (assumes OPENAI_API_KEY is set in your environment)
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Define a simple tool
@tool
def multiply(a: int, b: int) -> int:
    """Multiplies two numbers."""
    return a * b

tools = [multiply]

# Set up persistent memory with SqliteSaver backed by a local database file
conn = sqlite3.connect("checkpoints.db", check_same_thread=False)
memory = SqliteSaver(conn)

# Create the agent
agent_executor = create_react_agent(llm, tools, checkpointer=memory)

# Interact with the agent using a thread_id for memory
config = {"configurable": {"thread_id": "chat1"}}

response1 = agent_executor.invoke({"messages": [{"role": "user", "content": "What is 3 times 4?"}]}, config)
print(response1["messages"][-1].content)  # e.g., "3 times 4 is 12."

response2 = agent_executor.invoke({"messages": [{"role": "user", "content": "What was the result of the previous calculation?"}]}, config)
print(response2["messages"][-1].content)  # Recalls 12, even after a restart

This implementation uses SqliteSaver for file-based persistence (in "checkpoints.db"), providing true long-term conversation memory in LangGraph.

For fully custom behavior, you can subclass BaseCheckpointSaver to create your own checkpointer, tailoring persistence (e.g., to JSON files or other databases).
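
As a rough illustration, here's a minimal, dict-backed sketch of such a subclass. It's a toy (it ignores checkpoint namespaces and serialization, so it won't handle subgraphs or survive restarts), but it shows the core methods a checkpointer provides; pointing them at JSON files or another database follows the same shape:

from collections import defaultdict

from langgraph.checkpoint.base import BaseCheckpointSaver, CheckpointTuple


class DictCheckpointSaver(BaseCheckpointSaver):
    """Toy checkpointer keeping all checkpoints for each thread in a dict."""

    def __init__(self):
        super().__init__()
        self.checkpoints = defaultdict(dict)  # thread_id -> {checkpoint_id: (checkpoint, metadata)}
        self.pending_writes = defaultdict(list)  # (thread_id, checkpoint_id) -> writes

    def put(self, config, checkpoint, metadata, new_versions):
        # Called by LangGraph after each step to persist the new checkpoint
        thread_id = config["configurable"]["thread_id"]
        self.checkpoints[thread_id][checkpoint["id"]] = (checkpoint, metadata)
        return {"configurable": {"thread_id": thread_id, "checkpoint_id": checkpoint["id"]}}

    def put_writes(self, config, writes, task_id, task_path=""):
        # Stores intermediate writes so interrupted runs can resume
        key = (config["configurable"]["thread_id"], config["configurable"]["checkpoint_id"])
        self.pending_writes[key].extend((task_id, channel, value) for channel, value in writes)

    def get_tuple(self, config):
        # Returns the requested (or latest) checkpoint for a thread, or None
        thread_id = config["configurable"]["thread_id"]
        by_id = self.checkpoints.get(thread_id)
        if not by_id:
            return None
        # Checkpoint ids are time-sortable strings, so max() yields the newest
        checkpoint_id = config["configurable"].get("checkpoint_id") or max(by_id)
        checkpoint, metadata = by_id[checkpoint_id]
        return CheckpointTuple(
            config={"configurable": {"thread_id": thread_id, "checkpoint_id": checkpoint_id}},
            checkpoint=checkpoint,
            metadata=metadata,
            parent_config=None,
            pending_writes=self.pending_writes.get((thread_id, checkpoint_id), []),
        )

    def list(self, config, *, filter=None, before=None, limit=None):
        # Yields checkpoints for a thread, newest first (used for time travel)
        thread_id = config["configurable"]["thread_id"]
        for checkpoint_id in sorted(self.checkpoints.get(thread_id, {}), reverse=True):
            checkpoint, metadata = self.checkpoints[thread_id][checkpoint_id]
            yield CheckpointTuple(
                config={"configurable": {"thread_id": thread_id, "checkpoint_id": checkpoint_id}},
                checkpoint=checkpoint,
                metadata=metadata,
                parent_config=None,
                pending_writes=[],
            )

A DictCheckpointSaver() instance can then be passed as the checkpointer to create_react_agent, exactly like MemorySaver above.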

Example: Persistent Chat History

Building on the persistent memory above, let's apply it to a real-world example: a persistent chatbot for customer support. The agent remembers user details across sessions and restarts, improving personalization.

For example, invoke the SqliteSaver-backed agent with "Hi, I'm Bob.", stop and restart the script, then ask "Who am I?": thanks to the SQLite storage, the agent recalls the name, as sketched below. This addresses a common pain point with custom memory in LangChain agents, ensuring seamless experiences across many sessions.
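
Here's a minimal sketch of that flow, reusing the SqliteSaver setup from the previous section; the support_chat.db filename and customer-42 thread id are illustrative placeholders:

import sqlite3
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.sqlite import SqliteSaver

conn = sqlite3.connect("support_chat.db", check_same_thread=False)
memory = SqliteSaver(conn)

llm = ChatOpenAI(model="gpt-4o", temperature=0)
agent = create_react_agent(llm, [], checkpointer=memory)

# Use a stable id per customer so each user gets their own history
config = {"configurable": {"thread_id": "customer-42"}}

# First session
agent.invoke({"messages": [{"role": "user", "content": "Hi, I'm Bob."}]}, config)

# ...stop the script, restart it, rebuild the agent as above, then:
response = agent.invoke({"messages": [{"role": "user", "content": "Who am I?"}]}, config)
print(response["messages"][-1].content)  # Should mention Bob, recalled from SQLite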

Optimization Techniques

To keep your agent's memory efficient in LangGraph:

  • Limit History Length: Use message trimming functions like trim_messages to avoid token limits (see the sketch after this list).
  • Summarization: Implement summary nodes in your graph to condense long histories.
  • Entity Extraction: Add tools for entity extraction and store in a separate memory store for focused recall.
  • Async Persistence: Use async checkpointers like AsyncSqliteSaver for high-traffic apps to prevent bottlenecks.
  • Monitoring: Leverage LangSmith to trace memory usage and optimize graphs.
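
As a concrete example of the first technique, here's a sketch that trims history before every model call via create_react_agent's pre_model_hook (available in recent LangGraph releases). The 384-token budget is an arbitrary illustration value, and count_tokens_approximately is a rough heuristic counter from langchain_core:

from langchain_core.messages.utils import trim_messages, count_tokens_approximately
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI(model="gpt-4o", temperature=0)

def pre_model_hook(state):
    # Keep only the most recent messages that fit the token budget;
    # the full history stays in the checkpoint, only the LLM input shrinks
    trimmed = trim_messages(
        state["messages"],
        strategy="last",
        token_counter=count_tokens_approximately,
        max_tokens=384,
        start_on="human",  # never start mid-exchange
        include_system=True,
    )
    return {"llm_input_messages": trimmed}

agent = create_react_agent(
    llm,
    tools=[],
    pre_model_hook=pre_model_hook,
    checkpointer=MemorySaver(),
)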

These techniques optimize LLM agent memory, preventing issues like context overflow in extended conversations.

Conclusion: Enhancing Agent Intelligence

Customizing memory with LangGraph transforms basic chatbots into intelligent, context-aware systems. By leveraging checkpointers for persistence, you can build robust applications that remember and adapt across sessions. This not only improves user engagement but also positions your projects for scalability.

Ready to implement advanced conversation memory in LangGraph? If you need help with custom LangChain agents or optimizations, contact us at Focused.io. We're here to help you build agents that work!
