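Middleware

Middleware provides a way to more tightly control what happens inside the agent. The core agent loop calls a model, lets it choose tools to execute, and finishes when the model calls no more tools.

Middleware exposes hooks before and after each of those steps.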
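As a rough mental model, the loop and its hook points can be sketched in self-contained TypeScript. Everything in this sketch (`runLoop`, `Hooks`, the scripted `fakeModel`) is a hypothetical stand-in for illustration; it is not LangChain's implementation or API.

```typescript
// Toy agent loop: call the model, run the tools it asks for, stop when it asks for none.
// All names are illustrative stand-ins, not LangChain APIs.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelReply = { content: string; toolCalls: ToolCall[] };
type Message = { role: "user" | "ai" | "tool"; content: string };

interface Hooks {
  beforeModel?: (messages: Message[]) => void;
  afterModel?: (messages: Message[], reply: ModelReply) => void;
}

function runLoop(
  messages: Message[],
  model: (messages: Message[]) => ModelReply,
  tools: Record<string, (args: Record<string, unknown>) => string>,
  hooks: Hooks = {},
): Message[] {
  const history = [...messages];
  for (;;) {
    hooks.beforeModel?.(history); // hook point: before each model call
    const reply = model(history);
    history.push({ role: "ai", content: reply.content });
    hooks.afterModel?.(history, reply); // hook point: after each model response
    if (reply.toolCalls.length === 0) return history; // no tool calls: the loop ends
    for (const call of reply.toolCalls) {
      const tool = tools[call.name];
      const output = tool ? tool(call.args) : `Unknown tool: ${call.name}`;
      history.push({ role: "tool", content: output });
    }
  }
}

// Scripted model: requests one tool call, then answers once a tool result is present.
const fakeModel = (messages: Message[]): ModelReply =>
  messages.some((m) => m.role === "tool")
    ? { content: "It is sunny.", toolCalls: [] }
    : { content: "", toolCalls: [{ name: "weather", args: { city: "Paris" } }] };

const trace: string[] = [];
const finalHistory = runLoop(
  [{ role: "user", content: "Weather in Paris?" }],
  fakeModel,
  { weather: () => "sunny" },
  {
    beforeModel: () => trace.push("beforeModel"),
    afterModel: () => trace.push("afterModel"),
  },
);
```

In this run the model is called twice, so each hook fires twice, and the final history holds the user message, the tool-calling reply, the tool result, and the final answer.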

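What can middleware do?

Monitor

Track agent behavior with logging, analytics, and debugging

Modify

Transform prompts, tool selection, and output formatting

Control

Add retries, fallbacks, and early termination logic

Enforce

Apply rate limits, guardrails, and PII detection

Add middleware by passing it to createAgent: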

import {
  createAgent,
  summarizationMiddleware,
  humanInTheLoopMiddleware,
} from "langchain";

const agent = createAgent({
  model: "openai:gpt-4o",
  tools: [...],
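  // Built-in middleware are factory functions: call each one with its options (covered below)
  middleware: [
    summarizationMiddleware({ model: "openai:gpt-4o-mini" }),
    humanInTheLoopMiddleware({ interruptOn: {} }),
  ],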
});

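Built-in middleware

LangChain provides prebuilt middleware for common use cases:

Summarization

Automatically summarizes conversation history when it approaches the token limit.

Use cases:

Long-running conversations that exceed the context window
Multi-turn dialogues with extensive history
Applications that need to preserve the full conversation context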

import { createAgent, summarizationMiddleware } from "langchain";

const agent = createAgent({
  model: "openai:gpt-4o",
  tools: [weatherTool, calculatorTool],
  middleware: [
    summarizationMiddleware({
      model: "openai:gpt-4o-mini",
      maxTokensBeforeSummary: 4000, // Trigger summarization at 4000 tokens
      messagesToKeep: 20, // Keep last 20 messages after summary
      summaryPrompt: "Custom prompt for summarization...", // Optional
    }),
  ],
});

Configuration options

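model (string, required): Model used to generate summaries.
maxTokensBeforeSummary (number): Token threshold that triggers summarization.
messagesToKeep (number, default: 20): Number of most recent messages to keep.
tokenCounter (function): Custom token counting function. Defaults to character-based counting.
summaryPrompt (string): Custom prompt template. If not specified, a built-in template is used.
summaryPrefix (string, default: "## Previous conversation summary:"): Prefix for the summary message.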
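To make the interplay of these options concrete, here is a minimal sketch of threshold-triggered summarization. It is not the library's implementation: `maybeSummarize`, the `summarize` callback (a stand-in for a call to the summary model), and the estimate of roughly 4 characters per token are all assumptions made for illustration.

```typescript
// Sketch of threshold-triggered summarization. Illustrative only.
type Msg = { role: "system" | "user" | "ai"; content: string };

interface SummarizeOptions {
  maxTokensBeforeSummary: number;
  messagesToKeep: number;
  summaryPrefix: string;
  tokenCounter: (messages: Msg[]) => number;
  summarize: (messages: Msg[]) => string; // stand-in for a call to the summary model
}

// Assumed character-based estimate: about 4 characters per token.
const approxTokens = (messages: Msg[]): number =>
  Math.ceil(messages.reduce((n, m) => n + m.content.length, 0) / 4);

function maybeSummarize(messages: Msg[], opts: SummarizeOptions): Msg[] {
  // Below the threshold, or nothing older than the kept window: leave history untouched.
  if (opts.tokenCounter(messages) < opts.maxTokensBeforeSummary) return messages;
  if (messages.length <= opts.messagesToKeep) return messages;
  const cut = messages.length - opts.messagesToKeep;
  const older = messages.slice(0, cut);
  const recent = messages.slice(cut);
  const summary: Msg = {
    role: "system",
    content: `${opts.summaryPrefix}\n${opts.summarize(older)}`,
  };
  return [summary, ...recent];
}

// Ten messages of 40 characters each, about 100 estimated tokens in total.
const chat: Msg[] = Array.from({ length: 10 }, (_, i): Msg => ({
  role: i % 2 === 0 ? "user" : "ai",
  content: "x".repeat(40),
}));

const compacted = maybeSummarize(chat, {
  maxTokensBeforeSummary: 50,
  messagesToKeep: 4,
  summaryPrefix: "## Previous conversation summary:",
  tokenCounter: approxTokens,
  summarize: (older) => `${older.length} earlier messages omitted`,
});
```

Here the six oldest messages collapse into one prefixed summary message, followed by the four most recent messages.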

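Human-in-the-loop

Pauses agent execution so a human can approve, edit, or reject tool calls before they run.

Use cases:

High-stakes operations that require human approval (database writes, financial transactions)
Compliance workflows that require human oversight
Long-running conversations where human feedback guides the agent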

import { createAgent, humanInTheLoopMiddleware } from "langchain";

const agent = createAgent({
  model: "openai:gpt-4o",
  tools: [readEmailTool, sendEmailTool],
  middleware: [
    humanInTheLoopMiddleware({
      interruptOn: {
        // Require approval, editing, or rejection for sending emails
        send_email: {
          allowAccept: true,
          allowEdit: true,
          allowRespond: true,
        },
        // Auto-approve reading emails
        read_email: false,
      }
    })
  ]
});

Configuration options

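interruptOn (object, required): Mapping of tool names to approval configs.

Tool approval config options:

allowAccept (boolean, default: false): Whether approval is allowed.
allowEdit (boolean, default: false): Whether editing is allowed.
allowRespond (boolean, default: false): Whether responding/rejecting is allowed.

Important: Human-in-the-loop middleware requires a checkpointer to maintain state across interruptions. See the human-in-the-loop documentation for complete examples and integration patterns.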
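The need for persisted state can be illustrated with a toy pause-and-resume flow. This is a sketch under stated assumptions, not LangChain's API: the `checkpoints` map stands in for a real checkpointer, and `requestToolCall`, `resume`, and the `Decision` shape are hypothetical names chosen to mirror the accept, edit, and respond options above.

```typescript
// Sketch: pausing for review only works if the pending call is persisted. Illustrative only.
type PendingCall = { tool: string; args: Record<string, unknown> };
type Decision =
  | { type: "accept" }
  | { type: "edit"; args: Record<string, unknown> }
  | { type: "respond"; message: string };

// Stand-in checkpointer: stores the pending tool call per thread.
const checkpoints = new Map<string, PendingCall>();

function requestToolCall(threadId: string, call: PendingCall): string {
  checkpoints.set(threadId, call); // persist before pausing
  return `interrupted: awaiting review of ${call.tool}`;
}

function resume(
  threadId: string,
  decision: Decision,
  run: (call: PendingCall) => string,
): string {
  const pending = checkpoints.get(threadId);
  if (!pending) throw new Error(`No paused run for thread ${threadId}`);
  checkpoints.delete(threadId);
  switch (decision.type) {
    case "accept":
      return run(pending); // execute as proposed
    case "edit":
      return run({ ...pending, args: decision.args }); // execute with edited arguments
    case "respond":
      return `not executed: ${decision.message}`; // feedback is returned instead of running the tool
  }
}

const sendEmail = (call: PendingCall): string => `sent to ${String(call.args["to"])}`;

requestToolCall("thread-1", { tool: "send_email", args: { to: "a@example.com" } });
const outcome = resume("thread-1", { type: "edit", args: { to: "b@example.com" } }, sendEmail);
```

Without the stored entry for `thread-1`, the second step would have nothing to resume, which is the role the checkpointer plays for the real middleware.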

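Anthropic prompt caching

Reduces costs by caching repetitive prompt prefixes with Anthropic models.

Use cases:

Applications with long, repeated system prompts
Agents that reuse the same context across invocations
Reducing API costs for high-volume deployments

Learn more about Anthropic prompt caching strategies and limitations.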

import { createAgent, HumanMessage, anthropicPromptCachingMiddleware } from "langchain";

const LONG_PROMPT = `
Please be a helpful assistant.

<Lots more context ...>
`;

const agent = createAgent({
  model: "anthropic:claude-sonnet-4-latest",
  prompt: LONG_PROMPT,
  middleware: [anthropicPromptCachingMiddleware({ ttl: "5m" })],
});

// cache store
await agent.invoke({
  messages: [new HumanMessage("Hi, my name is Bob")]
});

// cache hit, system prompt is cached
const result = await agent.invoke({
  messages: [new HumanMessage("What's my name?")]
});

Configuration options

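ttl (string, default: "5m"): Time to live for cached content. Valid values: "5m" or "1h".

Model call limit

Limits the number of model calls to prevent infinite loops or excessive costs.

Use cases:

Preventing runaway agents from making too many API calls
Enforcing cost controls on production deployments
Testing agent behavior within a specific call budget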

import { createAgent, modelCallLimitMiddleware } from "langchain";

const agent = createAgent({
  model: "openai:gpt-4o",
  tools: [...],
  middleware: [
    modelCallLimitMiddleware({
      threadLimit: 10, // Max 10 calls per thread (across runs)
      runLimit: 5, // Max 5 calls per run (single invocation)
      exitBehavior: "end", // Or "error" to throw exception
    }),
  ],
});

Configuration options

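threadLimit (number): Maximum model calls across all runs in a thread. Defaults to no limit.
runLimit (number): Maximum model calls per single invocation. Defaults to no limit.
exitBehavior (string, default: "end"): Behavior when a limit is reached. Options: "end" (graceful termination) or "error" (throw an exception).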
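The difference between the two limits is easiest to see with counters: one that persists for the whole thread and one that resets on every run. The sketch below assumes a hypothetical `createCallBudget` helper and is not the middleware's actual implementation.

```typescript
// Sketch of thread-level and run-level call budgets. Illustrative only.
type ExitBehavior = "end" | "error";

interface BudgetOptions {
  threadLimit?: number;
  runLimit?: number;
  exitBehavior?: ExitBehavior;
}

function createCallBudget(options: BudgetOptions = {}) {
  const { threadLimit = Infinity, runLimit = Infinity, exitBehavior = "end" } = options;
  let threadCount = 0; // persists across runs in the same thread
  let runCount = 0; // reset at the start of each run

  return {
    startRun(): void {
      runCount = 0;
    },
    // Returns true when the call may proceed, false when the agent should stop gracefully.
    tryCall(): boolean {
      if (threadCount >= threadLimit || runCount >= runLimit) {
        if (exitBehavior === "error") throw new Error("Call limit reached");
        return false;
      }
      threadCount += 1;
      runCount += 1;
      return true;
    },
  };
}

const budget = createCallBudget({ threadLimit: 3, runLimit: 2 });

budget.startRun();
const firstRun = [budget.tryCall(), budget.tryCall(), budget.tryCall()]; // [true, true, false]

budget.startRun();
const secondRun = [budget.tryCall(), budget.tryCall()]; // [true, false]
```

The first run is stopped by the run limit after two calls; the second run gets only one call because the thread-level total then reaches three.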

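Tool call limit

Limits the number of calls to a specific tool or to all tools.

Use cases:

Preventing excessive calls to expensive external APIs
Limiting web searches or database queries
Enforcing rate limits on specific tool usage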

import { createAgent, toolCallLimitMiddleware } from "langchain";

// Limit all tool calls
const globalLimiter = toolCallLimitMiddleware({ threadLimit: 20, runLimit: 10 });

// Limit specific tool
const searchLimiter = toolCallLimitMiddleware({
  toolName: "search",
  threadLimit: 5,
  runLimit: 3,
});

const agent = createAgent({
  model: "openai:gpt-4o",
  tools: [...],
  middleware: [globalLimiter, searchLimiter],
});

Configuration options

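toolName (string): Specific tool to limit. If not provided, the limits apply to all tools.
threadLimit (number): Maximum tool calls across all runs in a thread. Defaults to no limit.
runLimit (number): Maximum tool calls per single invocation. Defaults to no limit.
exitBehavior (string, default: "end"): Behavior when a limit is reached. Options: "end" (graceful termination) or "error" (throw an exception).

Model fallback

Automatically falls back to alternative models when the primary model fails.

Use cases:

Building resilient agents that handle model outages
Optimizing cost by falling back to cheaper models
Adding provider redundancy across OpenAI, Anthropic, and others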

import { createAgent, modelFallbackMiddleware } from "langchain";

const agent = createAgent({
  model: "openai:gpt-4o", // Primary model
  tools: [...],
  middleware: [
    modelFallbackMiddleware(
      "openai:gpt-4o-mini", // Try first on error
      "anthropic:claude-3-5-sonnet-20241022" // Then this
    ),
  ],
});

Configuration options

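The middleware accepts a variable number of string arguments representing fallback models, in order.

...models (string[], required): One or more fallback model strings to try, in order, when the primary model fails.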

modelFallbackMiddleware(
  "first-fallback-model",
  "second-fallback-model",
  // ... more models
)
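The ordered-fallback behavior described above can be sketched as a loop that tries each model in turn and rethrows the last error only if all of them fail. `ModelFn` and `callWithFallback` are hypothetical names for illustration; the real middleware works with model identifiers rather than plain functions.

```typescript
// Sketch of ordered fallback. `ModelFn` stands in for a real model client.
type ModelFn = (prompt: string) => Promise<string>;

async function callWithFallback(
  prompt: string,
  primary: ModelFn,
  ...fallbacks: ModelFn[]
): Promise<string> {
  let lastError: unknown;
  for (const model of [primary, ...fallbacks]) {
    try {
      return await model(prompt); // first success wins
    } catch (error) {
      lastError = error; // remember the failure and try the next model
    }
  }
  throw lastError; // every model failed
}

// Stand-in models: one that always fails and one that echoes the prompt.
const failing: ModelFn = async () => {
  throw new Error("rate limited");
};
const working: ModelFn = async (prompt) => `echo: ${prompt}`;

callWithFallback("hi", failing, working).then((text) => console.log(text)); // prints "echo: hi"
```

Awaiting inside the `try` block matters: returning the promise without `await` would let a rejection escape the loop instead of triggering the next fallback.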

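PII detection

Detects and handles personally identifiable information (PII) in conversations.

Use cases:

Healthcare and financial applications with compliance requirements
Customer service agents that need to sanitize logs
Any application that handles sensitive user data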

import { createAgent, piiRedactionMiddleware } from "langchain";

const agent = createAgent({
  model: "openai:gpt-4o",
  tools: [...],
  middleware: [
    // Redact emails in user input
    piiRedactionMiddleware({
      piiType: "email",
      strategy: "redact",
      applyToInput: true,
    }),
    // Mask credit cards (show last 4 digits)
    piiRedactionMiddleware({
      piiType: "credit_card",
      strategy: "mask",
      applyToInput: true,
    }),
    // Custom PII type with regex
    piiRedactionMiddleware({
      piiType: "api_key",
      detector: /sk-[a-zA-Z0-9]{32}/,
      strategy: "block", // Throw error if detected
    }),
  ],
});

Configuration options

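piiType (string, required): Type of PII to detect. Can be a built-in type (email, credit_card, ip, mac_address, url) or a custom type name.
strategy (string, default: "redact"): How to handle detected PII. Options:

"block" - Throw an error when detected
"redact" - Replace with [REDACTED_TYPE]
"mask" - Partially mask (for example, ****-****-****-1234)
"hash" - Replace with a deterministic hash

detector (RegExp): Custom detection regex pattern. If not provided, the built-in detector for the PII type is used.
applyToInput (boolean, default: true): Check user messages before the model call.
applyToOutput (boolean, default: false): Check AI messages after the model call.
applyToToolResults (boolean, default: false): Check tool result messages after execution.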
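The four strategies can be compared side by side in a small sketch. The regular expressions, the `handlePii` helper, the hash output format, and the FNV-1a hash are assumptions made for illustration; the built-in detectors and the hash used by the library may differ.

```typescript
// Sketch of the four handling strategies. Illustrative; not the library's detectors.
type Strategy = "block" | "redact" | "mask" | "hash";

// FNV-1a (32-bit): a simple deterministic hash used here as a stand-in.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

function replacement(match: string, piiType: string, strategy: Strategy): string {
  switch (strategy) {
    case "redact":
      return `[REDACTED_${piiType.toUpperCase()}]`;
    case "mask":
      // Hide every digit except those in the last four characters.
      return match.slice(0, -4).replace(/\d/g, "*") + match.slice(-4);
    case "hash":
      return `<${piiType}_hash:${fnv1a(match)}>`;
    case "block":
      throw new Error(`Detected ${piiType} in text`);
  }
}

function handlePii(text: string, piiType: string, detector: RegExp, strategy: Strategy): string {
  // Force a global pattern so every occurrence is handled.
  const flags = detector.flags.includes("g") ? detector.flags : detector.flags + "g";
  return text.replace(new RegExp(detector.source, flags), (match) =>
    replacement(match, piiType, strategy),
  );
}

// Simplified example patterns (assumptions, not production-grade detectors).
const EMAIL = /[\w.+-]+@[\w-]+\.[\w.-]+/;
const CARD = /\b(?:\d{4}[- ]?){3}\d{4}\b/;

const redacted = handlePii("Contact jane@example.com now", "email", EMAIL, "redact");
// "Contact [REDACTED_EMAIL] now"
const masked = handlePii("Card 4111-1111-1111-1234", "credit_card", CARD, "mask");
// "Card ****-****-****-1234"
```

With "block", the callback throws on the first match, so text without PII passes through unchanged while text with PII never reaches the model.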

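Planning

Adds to-do list management for complex multi-step tasks.

This middleware automatically provides the agent with a write_todos tool and a system prompt that guide effective task planning.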

import { createAgent, HumanMessage, todoListMiddleware } from "langchain";

const agent = createAgent({
  model: "openai:gpt-4o",
  tools: [
    /* ... */
  ],
  middleware: [todoListMiddleware()] as const,
});

const result = await agent.invoke({
  messages: [new HumanMessage("Help me refactor my codebase")],
});
console.log(result.todos); // Array of todo items with status tracking

Configuration options

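No configuration options are available (defaults are used).

LLM tool selector

Uses an LLM to select relevant tools before calling the main model.

Use cases:

Agents with many tools (10+) where most are not relevant to a given query
Reducing token usage by filtering out irrelevant tools
Improving model focus and accuracy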

import { createAgent, llmToolSelectorMiddleware } from "langchain";

const agent = createAgent({
  model: "openai:gpt-4o",
  tools: [tool1, tool2, tool3, tool4, tool5, ...], // Many tools
  middleware: [
    llmToolSelectorMiddleware({
      model: "openai:gpt-4o-mini", // Use cheaper model for selection
      maxTools: 3, // Limit to 3 most relevant tools
      alwaysInclude: ["search"], // Always include certain tools
    }),
  ],
});

Configuration options

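model (string): Model used for tool selection. Defaults to the agent's main model.
maxTools (number): Maximum number of tools to select. Defaults to no limit.
alwaysInclude (string[]): Tool names that are always included in the selection.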
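How maxTools and alwaysInclude might combine is sketched below. The `selectTools` helper and the `ranked` list (standing in for the selection model's output) are hypothetical, and the sketch assumes that always-included tools do not count against maxTools; check the API reference for the actual behavior.

```typescript
// Sketch of narrowing a tool list before the main model call. Illustrative only.
interface SelectOptions {
  maxTools?: number;
  alwaysInclude?: string[];
}

function selectTools(
  allTools: string[],
  ranked: string[], // stand-in for the selection model's relevance ranking
  options: SelectOptions = {},
): string[] {
  const { maxTools = Infinity, alwaysInclude = [] } = options;
  const pinned = allTools.filter((name) => alwaysInclude.includes(name));
  // Assumption: pinned tools do not count against maxTools.
  const picked = ranked
    .filter((name) => allTools.includes(name) && !pinned.includes(name))
    .slice(0, maxTools);
  return [...pinned, ...picked];
}

const selected = selectTools(
  ["search", "calendar", "email", "weather", "calculator"],
  ["weather", "search", "calculator", "email"],
  { maxTools: 2, alwaysInclude: ["search"] },
);
// ["search", "weather", "calculator"]
```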

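Context editing

Manages conversation context by trimming, summarizing, or clearing tool uses.

Use cases:

Long conversations that need periodic context cleanup
Removing failed tool attempts from context
Custom context management strategies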

import { createAgent, contextEditingMiddleware, ClearToolUsesEdit } from "langchain";

const agent = createAgent({
  model: "openai:gpt-4o",
  tools: [...],
  middleware: [
    contextEditingMiddleware({
      edits: [
        new ClearToolUsesEdit({ maxTokens: 1000 }), // Clear old tool uses
      ],
    }),
  ],
});

Configuration options

edits (ContextEdit[], default: [new ClearToolUsesEdit()]): Array of ContextEdit strategies to apply.

ClearToolUsesEdit options:

maxTokens (number, default: 1000): Token count that triggers the edit.
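The idea of a token-triggered edit can be sketched as follows: once the estimated size of the context passes maxTokens, older tool results are replaced with a short placeholder. The `clearToolUses` function, the `keepRecent` parameter, the "[cleared]" placeholder, and the 4-characters-per-token estimate are assumptions for illustration, not the behavior of ClearToolUsesEdit itself.

```typescript
// Sketch of clearing old tool results once the context grows too large. Illustrative only.
type CtxMsg = { role: "user" | "ai" | "tool"; content: string };

// Assumed estimate: about 4 characters per token.
const estimateTokens = (messages: CtxMsg[]): number =>
  Math.ceil(messages.reduce((n, m) => n + m.content.length, 0) / 4);

function clearToolUses(messages: CtxMsg[], maxTokens: number, keepRecent = 1): CtxMsg[] {
  if (estimateTokens(messages) <= maxTokens) return messages; // under the trigger: no edit
  const toolIndexes = messages.flatMap((m, i) => (m.role === "tool" ? [i] : []));
  // Clear every tool result except the most recent `keepRecent` ones.
  const toClear = new Set(toolIndexes.slice(0, Math.max(toolIndexes.length - keepRecent, 0)));
  return messages.map((m, i) => (toClear.has(i) ? { ...m, content: "[cleared]" } : m));
}

const context: CtxMsg[] = [
  { role: "user", content: "q" },
  { role: "tool", content: "A".repeat(4000) }, // large, old tool result
  { role: "ai", content: "ok" },
  { role: "tool", content: "B".repeat(40) }, // most recent tool result
];

const edited = clearToolUses(context, 1000);
```

In this example the estimate exceeds 1000 tokens, so the large older tool result is replaced while the most recent one is kept intact.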

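Custom middleware

Build custom middleware by implementing hooks that run at specific points in the agent execution flow.

Class-based middleware

Two hook styles

Node-style hooks

Run sequentially at specific execution points. Use them for logging, validation, and state updates.

Wrap-style hooks

Intercept execution with full control over how the handler is called. Use them for retries, caching, and transformation.

Node-style hooks

Run at specific points in the execution flow:

beforeAgent - Before the agent starts (once per invocation)
beforeModel - Before each model call
afterModel - After each model response
afterAgent - After the agent completes (at most once per invocation)

Example: Logging middleware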

import { createMiddleware } from "langchain";

const loggingMiddleware = createMiddleware({
  name: "LoggingMiddleware",
  beforeModel: (state) => {
    console.log(`About to call model with ${state.messages.length} messages`);
    return;
  },
  afterModel: (state) => {
    const lastMessage = state.messages[state.messages.length - 1];
    console.log(`Model returned: ${lastMessage.content}`);
    return;
  },
});

Example: Conversation length limit

import { createMiddleware, AIMessage } from "langchain";

const createMessageLimitMiddleware = (maxMessages: number = 50) => {
  return createMiddleware({
    name: "MessageLimitMiddleware",
    beforeModel: (state) => {
      // Use >= so the limit still applies if the count has already passed maxMessages
      if (state.messages.length >= maxMessages) {
        return {
          messages: [new AIMessage("Conversation limit reached.")],
          jumpTo: "end",
        };
      }
      return;
    },
  });
};

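Wrap-style hooks

Intercept execution and control when the handler is called:

wrapModelCall - Around each model call
wrapToolCall - Around each tool call

You decide whether the handler is called zero times (short-circuit), once (normal flow), or multiple times (retry logic).

Example: Model retry middleware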

import { createMiddleware } from "langchain";

const createRetryMiddleware = (maxRetries: number = 3) => {
  return createMiddleware({
    name: "RetryMiddleware",
    wrapModelCall: async (request, handler) => {
      for (let attempt = 0; attempt < maxRetries; attempt++) {
        try {
          // Await inside try so a rejected promise is caught and retried
          return await handler(request);
        } catch (e) {
          if (attempt === maxRetries - 1) {
            throw e;
          }
          console.log(`Retry ${attempt + 1}/${maxRetries} after error: ${e}`);
        }
      }
      throw new Error("Unreachable");
    },
  });
};

Example: Dynamic model selection

import { createMiddleware, initChatModel } from "langchain";

const dynamicModelMiddleware = createMiddleware({
  name: "DynamicModelMiddleware",
  wrapModelCall: async (request, handler) => {
    // Use a different model based on conversation length
    const modifiedRequest = { ...request };
    if (request.messages.length > 10) {
      // initChatModel is asynchronous, so await the resolved model
      modifiedRequest.model = await initChatModel("openai:gpt-4o");
    } else {
      modifiedRequest.model = await initChatModel("openai:gpt-4o-mini");
    }
    return handler(modifiedRequest);
  },
});

Example: Tool call monitoring

import { createMiddleware } from "langchain";

const toolMonitoringMiddleware = createMiddleware({
  name: "ToolMonitoringMiddleware",
  wrapToolCall: async (request, handler) => {
    console.log(`Executing tool: ${request.toolCall.name}`);
    console.log(`Arguments: ${JSON.stringify(request.toolCall.args)}`);

    try {
      // Await so that asynchronous tool failures reach the catch block
      const result = await handler(request);
      console.log("Tool completed successfully");
      return result;
    } catch (e) {
      console.log(`Tool failed: ${e}`);
      throw e;
    }
  },
});

Custom state schema

Middleware can extend the agent's state with custom properties. Define a custom state type and set it as stateSchema:

import { createMiddleware, createAgent, HumanMessage } from "langchain";
import * as z from "zod";

// Middleware with custom state requirements
const callCounterMiddleware = createMiddleware({
  name: "CallCounterMiddleware",
  stateSchema: z.object({
    modelCallCount: z.number().default(0),
    userId: z.string().optional(),
  }),
  beforeModel: (state) => {
    // Access custom state properties
    if (state.modelCallCount > 10) {
      return { jumpTo: "end" };
    }
    return;
  },
  afterModel: (state) => {
    // Update custom state
    return { modelCallCount: state.modelCallCount + 1 };
  },
});

const agent = createAgent({
  model: "openai:gpt-4o",
  tools: [...],
  middleware: [callCounterMiddleware] as const,
});

// TypeScript enforces required state properties
const result = await agent.invoke({
  messages: [new HumanMessage("Hello")],
  modelCallCount: 0, // Optional due to default value
  userId: "user-123", // Optional
});

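Context extension

Context properties are configuration values passed through the runnable config. Unlike state, context is read-only and is typically used for configuration that does not change during execution. Middleware can define context requirements that the agent's configuration must satisfy: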

import * as z from "zod";
import { createAgent, createMiddleware, HumanMessage } from "langchain";

const rateLimitMiddleware = createMiddleware({
  name: "RateLimitMiddleware",
  contextSchema: z.object({
    maxRequestsPerMinute: z.number(),
    apiKey: z.string(),
  }),
  beforeModel: async (state, runtime) => {
    // Access context through runtime
    const { maxRequestsPerMinute, apiKey } = runtime.context;

    // Implement rate limiting logic (checkRateLimit is a hypothetical helper you supply)
    const allowed = await checkRateLimit(apiKey, maxRequestsPerMinute);
    if (!allowed) {
      return { jumpTo: "end" };
    }

    return;
  },
});

// The agent must include the middleware for its context schema to apply
const agent = createAgent({
  model: "openai:gpt-4o",
  tools: [...],
  middleware: [rateLimitMiddleware] as const,
});

// Context is provided through config
await agent.invoke(
  { messages: [new HumanMessage("Process data")] },
  {
    context: {
      maxRequestsPerMinute: 60,
      apiKey: "api-key-123",
    },
  }
);

Execution order

When using multiple middleware, it is important to understand the order in which they run:

const agent = createAgent({
  model: "openai:gpt-4o",
  middleware: [middleware1, middleware2, middleware3],
  tools: [...],
});

Execution flow

Before hooks run in order:

middleware1.beforeAgent()
middleware2.beforeAgent()
middleware3.beforeAgent()

Agent loop starts

middleware1.beforeModel()
middleware2.beforeModel()
middleware3.beforeModel()

Wrap hooks nest like function calls:

middleware1.wrapModelCall() → middleware2.wrapModelCall() → middleware3.wrapModelCall() → model

After hooks run in reverse order:

middleware3.afterModel()
middleware2.afterModel()
middleware1.afterModel()

Agent loop ends

middleware3.afterAgent()
middleware2.afterAgent()
middleware1.afterAgent()

Key rules:

before* hooks: first to last
after* hooks: last to first (reverse)
wrap* hooks: nested (the first middleware wraps all the others)
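These ordering rules can be checked with a small simulation of a single model step. The sketch is self-contained and does not use LangChain; `ToyMiddleware`, `runModelStep`, and `make` are hypothetical names, and only the ordering logic is meant to mirror the rules above.

```typescript
// Toy simulation of hook ordering across several middleware. Illustrative only.
type ModelCall = (input: string) => string;

interface ToyMiddleware {
  name: string;
  beforeModel?: (log: string[]) => void;
  afterModel?: (log: string[]) => void;
  wrapModelCall?: (input: string, next: ModelCall, log: string[]) => string;
}

function runModelStep(middleware: ToyMiddleware[], model: ModelCall, input: string): string[] {
  const log: string[] = [];
  const base: ModelCall = (inp) => {
    log.push("model");
    return model(inp);
  };

  // before hooks: first to last
  for (const m of middleware) m.beforeModel?.(log);

  // wrap hooks: compose from the right so the first middleware is the outermost wrapper
  const wrapped = middleware.reduceRight<ModelCall>((next, m) => {
    const wrap = m.wrapModelCall;
    return wrap ? (inp) => wrap(inp, next, log) : next;
  }, base);
  wrapped(input);

  // after hooks: last to first
  for (const m of [...middleware].reverse()) m.afterModel?.(log);
  return log;
}

const make = (name: string): ToyMiddleware => ({
  name,
  beforeModel: (log) => {
    log.push(`${name}.beforeModel`);
  },
  afterModel: (log) => {
    log.push(`${name}.afterModel`);
  },
  wrapModelCall: (input, next, log) => {
    log.push(`${name}.wrap:enter`);
    const output = next(input);
    log.push(`${name}.wrap:exit`);
    return output;
  },
});

const order = runModelStep([make("m1"), make("m2"), make("m3")], (s) => s.toUpperCase(), "hi");
```

The resulting log starts with the three beforeModel entries in list order, then enters the wrappers as m1, m2, m3, reaches the model, exits the wrappers as m3, m2, m1, and ends with the afterModel entries in reverse order.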

Agent jumps

To exit early from middleware, return an object with jumpTo:

import { createMiddleware, AIMessage } from "langchain";

const earlyExitMiddleware = createMiddleware({
  name: "EarlyExitMiddleware",
  beforeModel: (state) => {
    // Check some condition
    if (shouldExit(state)) {
      return {
        messages: [new AIMessage("Exiting early due to condition.")],
        jumpTo: "end",
      };
    }
    return;
  },
});

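Available jump targets:

"end": Jump to the end of the agent execution
"tools": Jump to the tools node
"model": Jump to the model node (or the first beforeModel hook)

Important: When jumping from beforeModel or afterModel, jumping to "model" causes all beforeModel middleware to run again. To enable jumping, decorate your hook with @hook_config(can_jump_to=[...]) (this is the Python decorator form):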

import { createMiddleware } from "langchain";

const conditionalMiddleware = createMiddleware({
  name: "ConditionalMiddleware",
  afterModel: (state) => {
    if (someCondition(state)) {
      return { jumpTo: "end" };
    }
    return;
  },
});

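Best practices

Keep middleware focused - each middleware should do one thing well
Handle errors gracefully - don't let middleware errors crash the agent
Use the appropriate hook type:
- Node-style for sequential logic (logging, validation)
- Wrap-style for control flow (retries, fallbacks, caching)
Clearly document any custom state properties
Unit test middleware independently before integrating it
Consider execution order - place critical middleware first in the list
Use built-in middleware when possible; don't reinvent the wheel :)

Examples

Dynamically selecting tools

Select relevant tools at runtime to improve performance and accuracy.

Benefits:

Shorter prompts - Reduce complexity by exposing only relevant tools
Better accuracy - The model chooses correctly from fewer options
Permission control - Dynamically filter tools based on user access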

import { createAgent, createMiddleware } from "langchain";

const toolSelectorMiddleware = createMiddleware({
  name: "ToolSelector",
  wrapModelCall: (request, handler) => {
    // Select a small, relevant subset of tools based on state/context
    const relevantTools = selectRelevantTools(request.state, request.runtime);
    const modifiedRequest = { ...request, tools: relevantTools };
    return handler(modifiedRequest);
  },
});

const agent = createAgent({
  model: "openai:gpt-4o",
  tools: allTools, // All available tools need to be registered upfront
  // Middleware can be used to select a smaller subset that's relevant for the given run.
  middleware: [toolSelectorMiddleware],
});

Extended example: GitHub vs GitLab tool selection

import * as z from "zod";
import { createAgent, createMiddleware, tool, HumanMessage } from "langchain";

const githubCreateIssue = tool(
  async ({ repo, title }) => ({
    url: `https://github.com/${repo}/issues/1`,
    title,
  }),
  {
    name: "github_create_issue",
    description: "Create an issue in a GitHub repository",
    schema: z.object({ repo: z.string(), title: z.string() }),
  }
);

const gitlabCreateIssue = tool(
  async ({ project, title }) => ({
    url: `https://gitlab.com/${project}/-/issues/1`,
    title,
  }),
  {
    name: "gitlab_create_issue",
    description: "Create an issue in a GitLab project",
    schema: z.object({ project: z.string(), title: z.string() }),
  }
);

const allTools = [githubCreateIssue, gitlabCreateIssue];

const toolSelector = createMiddleware({
  name: "toolSelector",
  contextSchema: z.object({ provider: z.enum(["github", "gitlab"]) }),
  wrapModelCall: (request, handler) => {
    const provider = request.runtime.context.provider;
    const toolName = provider === "gitlab" ? "gitlab_create_issue" : "github_create_issue";
    const selectedTools = request.tools.filter((t) => t.name === toolName);
    const modifiedRequest = { ...request, tools: selectedTools };
    return handler(modifiedRequest);
  },
});

const agent = createAgent({
  model: "openai:gpt-4o",
  tools: allTools,
  middleware: [toolSelector],
});

// Invoke with GitHub context
await agent.invoke(
  {
    messages: [
      new HumanMessage("Open an issue titled 'Bug: where are the cats' in the repository `its-a-cats-game`"),
    ],
  },
  {
    context: { provider: "github" },
  }
);

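Key points:

Register all tools upfront
Middleware selects the relevant subset for each request
Use contextSchema for configuration requirements

Additional resources

Middleware API reference - Complete guide to custom middleware
Human-in-the-loop - Add human review for sensitive operations
Testing agents - Strategies for testing safety mechanisms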
