
Overview

Memory is a system that records information about prior interactions. Memory is essential for agents: it lets them recall earlier exchanges, learn from feedback, and adapt to user preferences. As agents take on more complex tasks spanning many interactions, this capability becomes critical for both efficiency and user satisfaction.

Short-term memory lets your application remember prior interactions within a single thread or conversation. A thread organizes multiple interactions in one session, similar to the way email groups messages into a single conversation.

Conversation history is the most common form of short-term memory. Long conversations pose a challenge for today's LLMs (large language models): the full history may not fit in the model's context window, causing context loss or errors. Even if your model supports the full context length, most LLMs still perform poorly over long contexts. They get "distracted" by stale or off-topic content, while responses become slower and more expensive.

Chat models receive context via messages, which include instructions (a system message) and inputs (human messages). In chat applications, messages alternate between human input and model responses, so the message list grows over time. Because context windows are limited, many applications benefit from techniques that remove or "forget" stale information.

Usage

To add short-term memory (thread-level persistence) to an agent, specify a checkpointer when creating the agent.

LangChain agents manage short-term memory as part of the agent's state. By storing it in the graph's state, an agent can access the full context of a given conversation while keeping separate threads isolated. State is persisted to a database (or to memory) via a checkpointer, so a thread can be resumed at any time. Short-term memory updates when the agent is invoked or completes a step (such as a tool call), and state is read at the start of each step.
from langchain.agents import create_agent
from langgraph.checkpoint.memory import InMemorySaver


# get_user_info is assumed to be a tool defined elsewhere in your application
agent = create_agent(
    "openai:gpt-5",
    [get_user_info],
    checkpointer=InMemorySaver(),
)

agent.invoke(
    {"messages": [{"role": "user", "content": "Hi! My name is Bob."}]},
    {"configurable": {"thread_id": "1"}},
)

In production

In production, use a checkpointer backed by a database:
pip install langgraph-checkpoint-postgres
from langchain.agents import create_agent

from langgraph.checkpoint.postgres import PostgresSaver  


DB_URI = "postgresql://postgres:postgres@localhost:5442/postgres?sslmode=disable"
with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # auto-create the required tables in Postgres
    agent = create_agent(
        "openai:gpt-5",
        [get_user_info],
        checkpointer=checkpointer,  
    )

Customize agent memory

By default, agents use AgentState to manage short-term memory, in particular the conversation history stored under a messages key. You can extend AgentState to add additional fields. Pass the custom state schema to create_agent via the state_schema parameter.
from langchain.agents import create_agent, AgentState
from langgraph.checkpoint.memory import InMemorySaver


class CustomAgentState(AgentState):  
    user_id: str
    preferences: dict

agent = create_agent(
    "openai:gpt-5",
    [get_user_info],
    state_schema=CustomAgentState,  
    checkpointer=InMemorySaver(),
)

# Custom state can be passed in invoke
result = agent.invoke(
    {
        "messages": [{"role": "user", "content": "Hello"}],
        "user_id": "user_123",  
        "preferences": {"theme": "dark"}  
    },
    {"configurable": {"thread_id": "1"}})

Common patterns

With short-term memory enabled, long conversations can exceed the LLM's context window. Common solutions include:

Trim messages

Remove the first N messages before calling the LLM

Delete messages

Permanently delete messages from LangGraph state

Summarize messages

Summarize earlier messages in the history and replace them with a summary

Custom strategies

Custom strategies (e.g., message filtering)
These approaches let the agent keep track of the conversation without exceeding the LLM's context window.

Trim messages

Most LLMs have a maximum supported context window, measured in tokens. One way to decide when to truncate is to count the tokens in the message history and truncate whenever it approaches that limit. If you are using LangChain, you can use the trim_messages utility and specify the number of tokens to keep from the list, as well as the strategy for handling the boundary (e.g., keep the last max_tokens). To trim message history in an agent, use the @before_model middleware decorator:
from langchain.messages import RemoveMessage
from langgraph.graph.message import REMOVE_ALL_MESSAGES
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import before_model
from langgraph.runtime import Runtime
from langchain_core.runnables import RunnableConfig
from typing import Any


@before_model
def trim_messages(state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
    """Keep only the last few messages to fit context window."""
    messages = state["messages"]

    if len(messages) <= 3:
        return None  # No changes needed

    first_msg = messages[0]  # always keep the first message for context
    # Choose the tail length so first_msg + tail matches the original parity,
    # preserving the human/AI alternation of the history
    recent_messages = messages[-3:] if len(messages) % 2 == 0 else messages[-4:]
    new_messages = [first_msg] + recent_messages

    return {
        "messages": [
            RemoveMessage(id=REMOVE_ALL_MESSAGES),
            *new_messages
        ]
    }

agent = create_agent(
    model,
    tools=tools,
    middleware=[trim_messages],
    checkpointer=InMemorySaver(),
)

config: RunnableConfig = {"configurable": {"thread_id": "1"}}

agent.invoke({"messages": [{"role": "user", "content": "hi, my name is bob"}]}, config)
agent.invoke({"messages": [{"role": "user", "content": "write a short poem about cats"}]}, config)
agent.invoke({"messages": [{"role": "user", "content": "now do the same but for dogs"}]}, config)
final_response = agent.invoke({"messages": [{"role": "user", "content": "what's my name?"}]}, config)

final_response["messages"][-1].pretty_print()
"""
================================== Ai Message ==================================

Your name is Bob. You told me that earlier.
If you'd like me to call you a nickname or use a different name, just say the word.
"""

Delete messages

You can delete messages from the graph state to manage the message history. This is useful when you want to remove specific messages or clear the history entirely.

To delete messages from graph state, use RemoveMessage. For RemoveMessage to work, the state key must use the add_messages reducer, which the default AgentState provides.

To remove specific messages:
from langchain.messages import RemoveMessage  

def delete_messages(state):
    messages = state["messages"]
    if len(messages) > 2:
        # remove the earliest two messages
        return {"messages": [RemoveMessage(id=m.id) for m in messages[:2]]}  
To remove all messages:
from langgraph.graph.message import REMOVE_ALL_MESSAGES

def delete_messages(state):
    return {"messages": [RemoveMessage(id=REMOVE_ALL_MESSAGES)]}  
When deleting messages, make sure the resulting message history is valid. Check the limitations of the LLM provider you are using. For example:
  • Some providers expect the message history to start with a user message
  • Most providers require that an assistant message with tool calls is followed by the corresponding tool result messages
from langchain.messages import RemoveMessage
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import after_model
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.runtime import Runtime
from langchain_core.runnables import RunnableConfig


@after_model
def delete_old_messages(state: AgentState, runtime: Runtime) -> dict | None:
    """Remove old messages to keep conversation manageable."""
    messages = state["messages"]
    if len(messages) > 2:
        # remove the earliest two messages
        return {"messages": [RemoveMessage(id=m.id) for m in messages[:2]]}
    return None


agent = create_agent(
    "openai:gpt-5-nano",
    tools=[],
    system_prompt="Please be concise and to the point.",
    middleware=[delete_old_messages],
    checkpointer=InMemorySaver(),
)

config: RunnableConfig = {"configurable": {"thread_id": "1"}}

for event in agent.stream(
    {"messages": [{"role": "user", "content": "hi! I'm bob"}]},
    config,
    stream_mode="values",
):
    print([(message.type, message.content) for message in event["messages"]])

for event in agent.stream(
    {"messages": [{"role": "user", "content": "what's my name?"}]},
    config,
    stream_mode="values",
):
    print([(message.type, message.content) for message in event["messages"]])
[('human', "hi! I'm bob")]
[('human', "hi! I'm bob"), ('ai', 'Hi Bob! Nice to meet you. How can I help you today? I can answer questions, brainstorm ideas, draft text, explain things, or help with code.')]
[('human', "hi! I'm bob"), ('ai', 'Hi Bob! Nice to meet you. How can I help you today? I can answer questions, brainstorm ideas, draft text, explain things, or help with code.'), ('human', "what's my name?")]
[('human', "hi! I'm bob"), ('ai', 'Hi Bob! Nice to meet you. How can I help you today? I can answer questions, brainstorm ideas, draft text, explain things, or help with code.'), ('human', "what's my name?"), ('ai', 'Your name is Bob. How can I help you today, Bob?')]
[('human', "what's my name?"), ('ai', 'Your name is Bob. How can I help you today, Bob?')]

Summarize messages

The problem with trimming or deleting messages, as above, is that culling the message queue may lose information. Because of this, some applications benefit from a more sophisticated approach: using a chat model to summarize the message history. To summarize message history in an agent, use the built-in SummarizationMiddleware. See SummarizationMiddleware for more configuration options.
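A minimal setup sketch, assuming SummarizationMiddleware is importable from langchain.agents.middleware and that OPENAI_API_KEY is set; the parameter names shown (max_tokens_before_summary, messages_to_keep) are assumptions about its configuration options:

```python
from langchain.agents import create_agent
from langchain.agents.middleware import SummarizationMiddleware
from langgraph.checkpoint.memory import InMemorySaver

agent = create_agent(
    "openai:gpt-5",
    tools=[],
    middleware=[
        SummarizationMiddleware(
            model="openai:gpt-5-nano",       # cheaper model for the summarization calls
            max_tokens_before_summary=4000,  # summarize once history exceeds this budget
            messages_to_keep=20,             # keep the most recent messages verbatim
        ),
    ],
    checkpointer=InMemorySaver(),
)
```

Earlier messages beyond the configured window are replaced by a model-written summary, so the agent retains the gist of the conversation without the full token cost.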

Access memory

You can access and modify the short-term memory (state) of an agent in several ways:

Tools

Read short-term memory in a tool

Access short-term memory (state) in a tool using the ToolRuntime parameter. The ToolRuntime parameter is hidden from the tool signature (so the model doesn't see it), but the tool can access the state through it.

Write short-term memory from tools

To modify the agent's short-term memory (state) during execution, you can return state updates directly from tools. This is useful for persisting intermediate results or making information accessible to subsequent tools or prompts.

Prompt

Access short-term memory (state) in middleware to create dynamic prompts based on conversation history or custom state fields.

Before model

Access short-term memory (state) in @before_model middleware to process messages before model calls; the trim_messages example above uses this pattern.

After model

Access short-term memory (state) in @after_model middleware to process messages after model calls:
from langchain.messages import RemoveMessage
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents import create_agent, AgentState
from langchain.agents.middleware import after_model
from langgraph.runtime import Runtime


@after_model
def validate_response(state: AgentState, runtime: Runtime) -> dict | None:
    """Remove messages containing sensitive words."""
    STOP_WORDS = ["password", "secret"]
    last_message = state["messages"][-1]
    if any(word in last_message.content for word in STOP_WORDS):
        return {"messages": [RemoveMessage(id=last_message.id)]}
    return None

agent = create_agent(
    model="openai:gpt-5-nano",
    tools=[],
    middleware=[validate_response],
    checkpointer=InMemorySaver(),
)