Guardrails validate and filter content at key points of agent execution, helping you build safe, compliant AI applications. They can detect sensitive information, enforce content policies, validate outputs, and stop unsafe behavior before it causes problems. Common use cases include:
  • Preventing leakage of personally identifiable information (PII)
  • Detecting and blocking prompt injection attacks
  • Blocking inappropriate or harmful content
  • Enforcing business rules and compliance requirements
  • Validating output quality and accuracy
You can implement guardrails with middleware, which intercepts execution at key points: before the agent starts, after it completes, or around model and tool calls.
[Figure: Middleware flow diagram]
Guardrails can be implemented through two complementary approaches:

Deterministic guardrails

Use rule-based logic such as regex patterns, keyword matching, or explicit checks. Fast, predictable, and inexpensive, but may miss nuanced violations.

Model-based guardrails

Use an LLM or a classifier to evaluate content with semantic understanding. Catches subtle issues that rules miss, but is slower and more expensive.
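To make the distinction concrete, a deterministic check can be as simple as a regular expression. The sketch below is a standalone illustration, not the LangChain API; the key pattern is hypothetical (real secret formats vary by provider):

```python
import re

# Hypothetical pattern for API-key-like tokens; illustrative only.
API_KEY_PATTERN = re.compile(r"sk-[a-zA-Z0-9]{32}")

def violates_policy(text: str) -> bool:
    """Deterministic guardrail: True if the text contains an API-key-like token."""
    return bool(API_KEY_PATTERN.search(text))

print(violates_policy("my key is sk-" + "a" * 32))  # True
print(violates_policy("hello world"))               # False
```

A model-based guardrail would instead pass the text to a classifier or LLM, trading latency and cost for the ability to catch violations no fixed pattern describes.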
LangChain provides built-in guardrails (for example, PII detection and human-in-the-loop review), as well as a flexible middleware system for building custom guardrails using either approach.

Built-in guardrails

PII detection

LangChain provides built-in middleware for detecting and handling personally identifiable information (PII) in conversations. It can detect common PII types such as emails, credit cards, IP addresses, and more. The PII detection middleware is useful for healthcare and finance applications with compliance requirements, customer-service agents that need to sanitize logs, and any application handling sensitive user data. The PII middleware supports several strategies for handling detected PII:
| Strategy | Description | Example |
| --- | --- | --- |
| redact | Replace with [REDACTED_TYPE] | [REDACTED_EMAIL] |
| mask | Partially obscure (e.g., keep the last 4 digits) | ****-****-****-1234 |
| hash | Replace with a deterministic hash | a8f5f167... |
| block | Raise an exception when detected | Raises an error |
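The four strategies can be illustrated in plain Python, independent of the middleware. These functions sketch the general idea of each transformation; they are not LangChain's internals:

```python
import hashlib

def redact(value: str, pii_type: str) -> str:
    # Replace the whole value with a typed placeholder.
    return f"[REDACTED_{pii_type.upper()}]"

def mask(value: str, visible: int = 4) -> str:
    # Obscure all but the last `visible` alphanumeric characters,
    # keeping separators such as dashes intact.
    seen = 0
    out = []
    for ch in reversed(value):
        if ch.isalnum():
            seen += 1
            out.append(ch if seen <= visible else "*")
        else:
            out.append(ch)
    return "".join(reversed(out))

def hash_value(value: str) -> str:
    # Deterministic: the same input always produces the same digest.
    return hashlib.sha256(value.encode()).hexdigest()[:8] + "..."

def block(value: str) -> None:
    raise ValueError("PII detected; request blocked")

print(redact("john.doe@example.com", "email"))  # [REDACTED_EMAIL]
print(mask("4532-1234-5678-9010"))              # ****-****-****-9010
```

Note that masking keeps the trailing digits of the original value, so different cards still produce distinguishable masked strings, while hashing supports joining records on the same value without storing it.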
from langchain.agents import create_agent
from langchain.agents.middleware import PIIMiddleware


agent = create_agent(
    model="openai:gpt-4o",
    tools=[customer_service_tool, email_tool],
    middleware=[
        # Redact emails in user input before sending to model
        PIIMiddleware(
            "email",
            strategy="redact",
            apply_to_input=True,
        ),
        # Mask credit cards in user input
        PIIMiddleware(
            "credit_card",
            strategy="mask",
            apply_to_input=True,
        ),
        # Block API keys - raise error if detected
        PIIMiddleware(
            "api_key",
            detector=r"sk-[a-zA-Z0-9]{32}",
            strategy="block",
            apply_to_input=True,
        ),
    ],
)

# When user provides PII, it will be handled according to the strategy
result = agent.invoke({
    "messages": [{"role": "user", "content": "My email is john.doe@example.com and card is 4532-1234-5678-9010"}]
})
Built-in PII types:
  • email - email addresses
  • credit_card - credit card numbers (Luhn-validated)
  • ip - IP addresses
  • mac_address - MAC addresses
  • url - URLs
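The Luhn validation mentioned for credit_card is a standard checksum and can be sketched as follows. This is an illustration of the algorithm itself, not LangChain's implementation:

```python
def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right,
    subtract 9 from any result above 9, and require the total
    to be divisible by 10."""
    digits = [int(d) for d in number if d.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

print(luhn_valid("4532015112830366"))  # a Luhn-valid test number: True
print(luhn_valid("1234567890123456"))  # False
```

Luhn validation is why a middleware can distinguish real card numbers from arbitrary 16-digit strings, which keeps false positives low for the credit_card type.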
Configuration options:
| Parameter | Description | Default |
| --- | --- | --- |
| pii_type | The PII type to detect (built-in or custom) | Required |
| strategy | How to handle detected PII ("block", "redact", "mask", "hash") | "redact" |
| detector | Custom detection function or regex pattern | None (uses built-in) |
| apply_to_input | Check user messages before the model call | True |
| apply_to_output | Check AI messages after the model call | False |
| apply_to_tool_results | Check tool result messages after execution | False |
See the middleware documentation for full details on PII detection.

Human-in-the-loop

LangChain provides built-in middleware that requires human approval before executing sensitive operations. This is one of the most effective guardrails for high-stakes decisions. The human-in-the-loop middleware is useful for scenarios such as financial transactions and transfers, deleting or modifying production data, sending external communications, and any operation with significant business impact.
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.types import Command


agent = create_agent(
    model="openai:gpt-4o",
    tools=[search_tool, send_email_tool, delete_database_tool],
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                # Require approval for sensitive operations
                "send_email": True,
                "delete_database": True,
                # Auto-approve safe operations
                "search": False,
            }
        ),
    ],
    # Persist the state across interrupts
    checkpointer=InMemorySaver(),
)

# Human-in-the-loop requires a thread ID for persistence
config = {"configurable": {"thread_id": "some_id"}}

# Agent will pause and wait for approval before executing sensitive tools
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Send an email to the team"}]},
    config=config
)

result = agent.invoke(
    Command(resume={"decisions": [{"type": "approve"}]}),
    config=config  # Same thread ID to resume the paused conversation
)
See the human-in-the-loop documentation for full details on implementing approval workflows.

Custom guardrails

For more sophisticated guardrails, you can create custom middleware that runs before or after agent execution. This gives you full control over validation logic, content filtering, and safety checks.

Before-agent guardrails

Use the before_agent hook to validate requests at the start of each invocation. This is useful for session-level checks such as authentication, rate limiting, or blocking inappropriate requests before any processing begins.
from typing import Any

from langchain.agents.middleware import AgentMiddleware, AgentState, hook_config
from langgraph.runtime import Runtime

class ContentFilterMiddleware(AgentMiddleware):
    """Deterministic guardrail: Block requests containing banned keywords."""

    def __init__(self, banned_keywords: list[str]):
        super().__init__()
        self.banned_keywords = [kw.lower() for kw in banned_keywords]

    @hook_config(can_jump_to=["end"])
    def before_agent(self, state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
        # Get the first user message
        if not state["messages"]:
            return None

        first_message = state["messages"][0]
        if first_message.type != "human":
            return None

        content = first_message.content.lower()

        # Check for banned keywords
        for keyword in self.banned_keywords:
            if keyword in content:
                # Block execution before any processing
                return {
                    "messages": [{
                        "role": "assistant",
                        "content": "I cannot process requests containing inappropriate content. Please rephrase your request."
                    }],
                    "jump_to": "end"
                }

        return None

# Use the custom guardrail
from langchain.agents import create_agent

agent = create_agent(
    model="openai:gpt-4o",
    tools=[search_tool, calculator_tool],
    middleware=[
        ContentFilterMiddleware(
            banned_keywords=["hack", "exploit", "malware"]
        ),
    ],
)

# This request will be blocked before any processing
result = agent.invoke({
    "messages": [{"role": "user", "content": "How do I hack into a database?"}]
})

After-agent guardrails

Use the after_agent hook to validate the final output before it is returned to the user. This is useful for model-based safety checks, quality validation, or a final compliance scan of the complete agent response.
from langchain.agents.middleware import AgentMiddleware, AgentState, hook_config
from langgraph.runtime import Runtime
from langchain_core.messages import AIMessage
from langchain.chat_models import init_chat_model
from typing import Any

class SafetyGuardrailMiddleware(AgentMiddleware):
    """Model-based guardrail: Use an LLM to evaluate response safety."""

    def __init__(self):
        super().__init__()
        self.safety_model = init_chat_model("openai:gpt-4o-mini")

    @hook_config(can_jump_to=["end"])
    def after_agent(self, state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
        # Get the final AI response
        if not state["messages"]:
            return None

        last_message = state["messages"][-1]
        if not isinstance(last_message, AIMessage):
            return None

        # Use a model to evaluate safety
        safety_prompt = f"""Evaluate if this response is safe and appropriate.
        Respond with only 'SAFE' or 'UNSAFE'.

        Response: {last_message.content}"""

        result = self.safety_model.invoke([{"role": "user", "content": safety_prompt}])

        if "UNSAFE" in result.content:
            return {
                "messages": [{
                    "role": "assistant",
                    "content": "I cannot provide that response. Please rephrase your request."
                }],
                "jump_to": "end"
            }

        return None

# Use the safety guardrail
from langchain.agents import create_agent

agent = create_agent(
    model="openai:gpt-4o",
    tools=[search_tool, calculator_tool],
    middleware=[SafetyGuardrailMiddleware()],
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "How do I make explosives?"}]
})

Combining multiple guardrails

You can stack multiple guardrails by adding them to the middleware array. They execute in order, letting you build layered protection:
from langchain.agents import create_agent
from langchain.agents.middleware import PIIMiddleware, HumanInTheLoopMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    tools=[search_tool, send_email_tool],
    middleware=[
        # Layer 1: Deterministic input filter (before agent)
        ContentFilterMiddleware(banned_keywords=["hack", "exploit"]),

        # Layer 2: PII protection (before and after model)
        PIIMiddleware("email", strategy="redact", apply_to_input=True),
        PIIMiddleware("email", strategy="redact", apply_to_output=True),

        # Layer 3: Human approval for sensitive tools
        HumanInTheLoopMiddleware(interrupt_on={"send_email": True}),

        # Layer 4: Model-based safety check (after agent)
        SafetyGuardrailMiddleware(),
    ],
)

Additional resources