
Overview

In this tutorial, we will build a retrieval agent with LangGraph. LangChain provides a built-in agent implementation, built on LangGraph primitives; if you need deeper customization, you can implement the agent directly in LangGraph. This guide walks through an example implementation of a retrieval agent. Retrieval agents are useful when you want an LLM to decide whether to retrieve context from a vector store or respond to the user directly.

By the end of the tutorial, we will have:
  1. Fetched and preprocessed documents for retrieval.
  2. Indexed those documents for semantic search and created a retriever tool for the agent.
  3. Built an agentic RAG system that decides when to use the retriever tool.

Concepts

We will cover the following concepts: retrieval tools, document grading, query rewriting, and conditional routing with LangGraph.

Setup

Let's install the required packages and set our API keys:
npm install @langchain/langgraph @langchain/openai @langchain/community @langchain/textsplitters
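The loaders and models below read credentials from the environment. A minimal setup, assuming a bash-like shell (variable names follow current LangSmith conventions; adjust to your deployment):

```shell
# Required by the OpenAI embeddings and chat models below
export OPENAI_API_KEY="sk-..."

# Optional: enable LangSmith tracing for debugging and monitoring
export LANGSMITH_TRACING="true"
export LANGSMITH_API_KEY="..."
```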
Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor LLM apps built with LangGraph.

1. Preprocess documents

  1. Fetch documents to use in our RAG system. We will use the three most recent pages from Lilian Weng's excellent blog, starting by loading the page content with CheerioWebBaseLoader:
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";

const urls = [
  "https://lilianweng.github.io/posts/2023-06-23-agent/",
  "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
  "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
];

const docs = await Promise.all(
  urls.map((url) => new CheerioWebBaseLoader(url).load()),
);
  2. Split the fetched documents into smaller chunks for indexing into our vector store:
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

const docsList = docs.flat();

const textSplitter = new RecursiveCharacterTextSplitter({
  chunkSize: 500,
  chunkOverlap: 50,
});
const docSplits = await textSplitter.splitDocuments(docsList);
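To build intuition for chunkSize and chunkOverlap, here is a dependency-free sketch of fixed-size chunking with overlap. (The real RecursiveCharacterTextSplitter is smarter: it first tries to split on paragraph and sentence separators before falling back to character counts.)

```typescript
// Simplified character chunking: each chunk is at most `chunkSize`
// characters and repeats the last `chunkOverlap` characters of the
// previous chunk, so context is not lost at chunk boundaries.
function chunkText(text: string, chunkSize: number, chunkOverlap: number): string[] {
  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

console.log(chunkText("abcdefghij", 4, 2));
// → ["abcd", "cdef", "efgh", "ghij"]
```

With chunkSize 500 and chunkOverlap 50 as above, each 500-character chunk shares its first 50 characters with the end of the previous one.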

2. Create a retriever tool

Now that the documents are split, we can index them into the vector store that we will use for semantic search.
  1. Use an in-memory vector store with OpenAI embeddings:
import { MemoryVectorStore } from "@langchain/classic/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

const vectorStore = await MemoryVectorStore.fromDocuments(
  docSplits,
  new OpenAIEmbeddings(),
);

const retriever = vectorStore.asRetriever();
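Semantic search here ranks chunks by similarity between embedding vectors, typically cosine similarity (MemoryVectorStore's default). A minimal sketch of the ranking step, purely illustrative (real OpenAI embedding vectors have length 1536, not 2):

```typescript
// Cosine similarity: dot product divided by the product of the
// vector magnitudes; 1 means same direction, 0 means orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank documents by similarity to a query vector, most similar first.
function rank(query: number[], docs: { id: string; vector: number[] }[]) {
  return [...docs].sort(
    (x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector),
  );
}
```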
  2. Create a retriever tool with LangChain's prebuilt createRetrieverTool:
import { createRetrieverTool } from "@langchain/classic/tools/retriever";

const tool = createRetrieverTool(
  retriever,
  {
    name: "retrieve_blog_posts",
    description:
      "Search and return information about Lilian Weng blog posts on LLM agents, prompt engineering, and adversarial attacks on LLMs.",
  },
);
const tools = [tool];

3. Generate a query

Now we will start building the components (nodes) of our agentic RAG graph.
  1. Build a generateQueryOrRespond node. It calls an LLM to generate a response based on the current graph state (the list of messages). Given the input messages, it decides whether to retrieve using the retriever tool or respond to the user directly. Note that we give the chat model access to the tools we created earlier via .bindTools:
import { ChatOpenAI } from "@langchain/openai";

async function generateQueryOrRespond(state) {
  const { messages } = state;
  const model = new ChatOpenAI({
    model: "gpt-4o",
    temperature: 0,
  }).bindTools(tools);  

  const response = await model.invoke(messages);
  return {
    messages: [response],
  };
}
  2. Try it on a random input:
import { HumanMessage } from "@langchain/core/messages";

const input = { messages: [new HumanMessage("hello!")] };
const result = await generateQueryOrRespond(input);
console.log(result.messages[0]);
Output:
AIMessage {
  content: "Hello! How can I help you today?",
  tool_calls: []
}
  3. Ask a question that requires semantic search:
const input = {
  messages: [
    new HumanMessage("What does Lilian Weng say about types of reward hacking?")
  ]
};
const result = await generateQueryOrRespond(input);
console.log(result.messages[0]);
Output:
AIMessage {
  content: "",
  tool_calls: [
    {
      name: "retrieve_blog_posts",
      args: { query: "types of reward hacking" },
      id: "call_...",
      type: "tool_call"
    }
  ]
}
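The two outputs above differ only in the tool_calls field, and that field is all the graph needs for routing later on. Reduced to its essence, the decision looks like this (a hypothetical sketch, not the LangChain API):

```typescript
// A message either carries tool calls (→ retrieve) or plain
// content (→ respond to the user directly).
interface MessageLike {
  content: string;
  tool_calls?: { name: string; args: Record<string, unknown> }[];
}

function wantsRetrieval(msg: MessageLike): boolean {
  return (msg.tool_calls?.length ?? 0) > 0;
}
```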

4. Grade documents

  1. Add a gradeDocuments step to determine whether the retrieved documents are relevant to the question. We will use a model with structured output, defined with a Zod schema, for the grading. gradeDocuments also serves as a conditional edge: it checks the grading result and returns the name of the node to go to next (generate or rewrite):
import * as z from "zod";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";
import { AIMessage, ToolMessage } from "@langchain/core/messages";

const prompt = ChatPromptTemplate.fromTemplate(
  `You are a grader assessing relevance of retrieved docs to a user question.
  Here are the retrieved docs:
  \n ------- \n
  {context}
  \n ------- \n
  Here is the user question: {question}
  If the content of the docs is relevant to the user's question, score them as relevant.
  Give a binary score 'yes' or 'no' to indicate whether the docs are relevant to the question.
  Yes: The docs are relevant to the question.
  No: The docs are not relevant to the question.`,
);

const gradeDocumentsSchema = z.object({
  binaryScore: z.string().describe("Relevance score 'yes' or 'no'"),
});

async function gradeDocuments(state) {
  const { messages } = state;

  const model = new ChatOpenAI({
    model: "gpt-4o",
    temperature: 0,
  }).withStructuredOutput(gradeDocumentsSchema);

  // Compose the grading prompt with the structured-output model
  const chain = prompt.pipe(model);

  const score = await chain.invoke({
    question: messages.at(0)?.content,
    context: messages.at(-1)?.content,
  });

  if (score.binaryScore === "yes") {
    return "generate";
  }
  return "rewrite";
}
  2. Run it with irrelevant documents in the tool response:
const input = {
  messages: [
    new HumanMessage("What does Lilian Weng say about types of reward hacking?"),
    new AIMessage({
      content: "",
      tool_calls: [
        {
          type: "tool_call",
          name: "retrieve_blog_posts",
          args: { query: "types of reward hacking" },
          id: "1",
        },
      ],
    }),
    new ToolMessage({
      content: "meow",
      tool_call_id: "1",
    }),
  ],
};
const result = await gradeDocuments(input);
  3. Confirm that relevant documents are classified as such:
const input = {
  messages: [
    new HumanMessage("What does Lilian Weng say about types of reward hacking?"),
    new AIMessage({
      content: "",
      tool_calls: [
        {
          type: "tool_call",
          name: "retrieve_blog_posts",
          args: { query: "types of reward hacking" },
          id: "1",
        },
      ],
    }),
    new ToolMessage({
      content: "reward hacking can be categorized into two types: environment or goal misspecification, and reward tampering",
      tool_call_id: "1",
    }),
  ],
};
const result = await gradeDocuments(input);
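Structured output constrains the grader, but the binaryScore string can still come back as "Yes", "yes.", and so on. A small normalization step before branching makes the routing robust (an illustrative addition, not part of the original tutorial):

```typescript
// Map the grader's binaryScore to the next node, tolerating case
// and trailing punctuation; anything that isn't "yes" rewrites.
function routeFromScore(binaryScore: string): "generate" | "rewrite" {
  return /^yes\b/i.test(binaryScore.trim()) ? "generate" : "rewrite";
}
```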

5. Rewrite the question

  1. Build the rewrite node. The retriever tool can return potentially irrelevant documents, which signals that the original user question needs refinement. For that, we call the rewrite node:
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";

const rewritePrompt = ChatPromptTemplate.fromTemplate(
  `Look at the input and try to reason about the underlying semantic intent / meaning. \n
  Here is the initial question:
  \n ------- \n
  {question}
  \n ------- \n
  Formulate an improved question:`,
);

async function rewrite(state) {
  const { messages } = state;
  const question = messages.at(0)?.content;

  const model = new ChatOpenAI({
    model: "gpt-4o",
    temperature: 0,
  });

  const response = await rewritePrompt.pipe(model).invoke({ question });
  return {
    messages: [response],
  };
}
  2. Try it out:
import { HumanMessage, AIMessage, ToolMessage } from "@langchain/core/messages";

const input = {
  messages: [
    new HumanMessage("What does Lilian Weng say about types of reward hacking?"),
    new AIMessage({
      content: "",
      tool_calls: [
        {
          id: "1",
          name: "retrieve_blog_posts",
          args: { query: "types of reward hacking" },
          type: "tool_call"
        }
      ]
    }),
    new ToolMessage({ content: "meow", tool_call_id: "1" })
  ]
};

const response = await rewrite(input);
console.log(response.messages[0].content);
Output:
What are the different types of reward hacking described by Lilian Weng, and how does she explain them?

6. Generate an answer

  1. Build the generate node: if we pass the grader's check, we can generate the final answer from the original question and the retrieved context:
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";

async function generate(state) {
  const { messages } = state;
  const question = messages.at(0)?.content;
  const context = messages.at(-1)?.content;

  const prompt = ChatPromptTemplate.fromTemplate(
  `You are an assistant for question-answering tasks.
      Use the following pieces of retrieved context to answer the question.
      If you don't know the answer, just say that you don't know.
      Use three sentences maximum and keep the answer concise.
      Question: {question}
      Context: {context}`
  );

  const llm = new ChatOpenAI({
    model: "gpt-4o",
    temperature: 0,
  });

  const ragChain = prompt.pipe(llm);

  const response = await ragChain.invoke({
    context,
    question,
  });

  return {
    messages: [response],
  };
}
  2. Try it out:
import { HumanMessage, AIMessage, ToolMessage } from "@langchain/core/messages";

const input = {
  messages: [
    new HumanMessage("What does Lilian Weng say about types of reward hacking?"),
    new AIMessage({
      content: "",
      tool_calls: [
        {
          id: "1",
          name: "retrieve_blog_posts",
          args: { query: "types of reward hacking" },
          type: "tool_call"
        }
      ]
    }),
    new ToolMessage({
      content: "reward hacking can be categorized into two types: environment or goal misspecification, and reward tampering",
      tool_call_id: "1"
    })
  ]
};

const response = await generate(input);
console.log(response.messages[0].content);
Output:
Lilian Weng categorizes reward hacking into two types: environment or goal misspecification, and reward tampering. She considers reward hacking as a broad concept that includes both of these categories. Reward hacking occurs when an agent exploits flaws or ambiguities in the reward function to achieve high rewards without performing the intended behaviors.

7. Assemble the graph

Now we assemble all the nodes and edges into a complete graph:
  • Start with generateQueryOrRespond, which determines whether the retriever tool needs to be called
  • Route to the next step with a conditional edge:
    • If generateQueryOrRespond returns tool_calls, call the retriever tool to retrieve context
    • Otherwise, respond to the user directly
  • Grade the retrieved document content for relevance to the question (gradeDocuments) and route to the next step:
    • If not relevant, rewrite the question with rewrite and then call generateQueryOrRespond again
    • If relevant, proceed to generate, which produces the final response using the ToolMessage with the retrieved document context
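Taken together, the routing above is a small state machine. A dependency-free simulation of the control flow (nodes stubbed out; the grades array stands in for the grader's verdict on each retrieval attempt):

```typescript
type Node = "generateQueryOrRespond" | "retrieve" | "rewrite" | "generate" | "END";

// Walk the graph's edges with stubbed node behavior: the model always
// chooses to retrieve, and grades[i] is the grader's verdict for the
// i-th retrieval. Returns the sequence of nodes visited.
function simulate(grades: ("yes" | "no")[]): Node[] {
  const visited: Node[] = [];
  let node: Node = "generateQueryOrRespond";
  let attempt = 0;
  while (node !== "END") {
    visited.push(node);
    switch (node) {
      case "generateQueryOrRespond":
        node = "retrieve"; // stub: the model emits a tool call
        break;
      case "retrieve":
        node = grades[attempt++] === "yes" ? "generate" : "rewrite";
        break;
      case "rewrite":
        node = "generateQueryOrRespond"; // loop back with an improved question
        break;
      case "generate":
        node = "END";
        break;
    }
  }
  return visited;
}

console.log(simulate(["no", "yes"]).join(" → "));
// → generateQueryOrRespond → retrieve → rewrite → generateQueryOrRespond → retrieve → generate
```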
import { StateGraph, START, END, MessagesAnnotation } from "@langchain/langgraph";
import { ToolNode } from "@langchain/langgraph/prebuilt";
import { isAIMessage } from "@langchain/core/messages";

// Create a ToolNode for the retriever
const toolNode = new ToolNode(tools);

// Helper function to determine if we should retrieve
function shouldRetrieve(state) {
  const { messages } = state;
  const lastMessage = messages.at(-1);

  if (isAIMessage(lastMessage) && lastMessage.tool_calls?.length) {
    return "retrieve";
  }
  return END;
}

// Define the graph. Note that gradeDocuments returns the name of the
// next node ("generate" or "rewrite"), so it is wired in directly as a
// conditional edge after retrieval rather than as a node.
const builder = new StateGraph(MessagesAnnotation)
  .addNode("generateQueryOrRespond", generateQueryOrRespond)
  .addNode("retrieve", toolNode)
  .addNode("rewrite", rewrite)
  .addNode("generate", generate)
  // Add edges
  .addEdge(START, "generateQueryOrRespond")
  // Decide whether to retrieve
  .addConditionalEdges("generateQueryOrRespond", shouldRetrieve)
  // Route based on the grading decision
  .addConditionalEdges("retrieve", gradeDocuments)
  .addEdge("generate", END)
  .addEdge("rewrite", "generateQueryOrRespond");

// Compile
const graph = builder.compile();

8. Run the agentic RAG

Now let's test the complete graph by running a question through it.
import { HumanMessage } from "@langchain/core/messages";

const inputs = {
  messages: [
    new HumanMessage("What does Lilian Weng say about types of reward hacking?")
  ]
};

for await (const output of await graph.stream(inputs)) {
  for (const [key, value] of Object.entries(output)) {
    const lastMsg = value.messages.at(-1);
    console.log(`Output from node: '${key}'`);
    console.log({
      type: lastMsg._getType(),
      content: lastMsg.content,
      tool_calls: lastMsg.tool_calls,
    });
    console.log("---\n");
  }
}
Output:
Output from node: 'generateQueryOrRespond'
{
  type: 'ai',
  content: '',
  tool_calls: [
    {
      name: 'retrieve_blog_posts',
      args: { query: 'types of reward hacking' },
      id: 'call_...',
      type: 'tool_call'
    }
  ]
}
---

Output from node: 'retrieve'
{
  type: 'tool',
  content: '(Note: Some work defines reward tampering as a distinct category...\n' +
    'At a high level, reward hacking can be categorized into two types: environment or goal misspecification, and reward tampering.\n' +
    '...',
  tool_calls: undefined
}
---

Output from node: 'generate'
{
  type: 'ai',
  content: 'Lilian Weng categorizes reward hacking into two types: environment or goal misspecification, and reward tampering. She considers reward hacking as a broad concept that includes both of these categories. Reward hacking occurs when an agent exploits flaws or ambiguities in the reward function to achieve high rewards without performing the intended behaviors.',
  tool_calls: []
}
---