Retrieval - Docs by LangChain

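Large language models (LLMs) are powerful, but they have two key limitations:

Finite context: they cannot ingest an entire corpus at once.
Static knowledge: their training data is frozen at a point in time.

Retrieval addresses both problems by fetching relevant external knowledge at query time. This is the foundation of **Retrieval-Augmented Generation (RAG)**: enhancing an LLM's answers with context-specific information.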

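Building a knowledge base

A knowledge base is a repository of documents or structured data used during retrieval. If you need a custom knowledge base, you can use LangChain's document loaders and vector stores to build one from your own data.

If you already have a knowledge base (for example, a SQL database, a CRM, or an internal documentation system), you do not need to rebuild it. You can:

Connect it as a tool for an agent in Agentic RAG.
Query it and supply the retrieved content as context to the LLM (2-Step RAG).

See the following tutorial to build a searchable knowledge base and a minimal RAG workflow: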

Tutorial: Semantic search

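Learn how to create a searchable knowledge base from your own data using LangChain's document loaders, embeddings, and vector stores. In this tutorial, you'll build a search engine over a PDF that retrieves the passages relevant to a query. You'll also implement a minimal RAG workflow on top of this engine to see how external knowledge can be integrated into LLM reasoning.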

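From retrieval to RAG

Retrieval allows LLMs to access relevant context at runtime. Most real-world applications go one step further: they integrate retrieval with generation to produce grounded, context-aware answers. This is the core idea behind **Retrieval-Augmented Generation (RAG)**. The retrieval pipeline becomes the foundation of a broader system that combines search with generation.

Retrieval pipeline

A typical retrieval workflow loads documents, splits them into chunks, embeds the chunks, and stores them in a vector store that is searched at query time. Each component is modular: you can swap loaders, splitters, embeddings, or vector stores without rewriting the application's logic.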
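
The modularity described above can be sketched in plain TypeScript. This is a minimal illustration, not LangChain's API: the `Loader`, `Splitter`, and `Embedder` types, the `buildIndex` function, and the toy embedder are all hypothetical names introduced here. The point is that the pipeline depends only on the shape of each component, so one splitter can replace another without touching the rest.

```typescript
// Hypothetical component types for illustration; not LangChain's actual interfaces.
type Loader = () => string[];
type Splitter = (text: string) => string[];
type Embedder = (text: string) => number[];

interface IndexedChunk {
  text: string;
  vector: number[];
}

// The pipeline depends only on the component types, so any stage can be swapped.
function buildIndex(load: Loader, split: Splitter, embed: Embedder): IndexedChunk[] {
  const index: IndexedChunk[] = [];
  for (const doc of load()) {
    for (const chunk of split(doc)) {
      index.push({ text: chunk, vector: embed(chunk) });
    }
  }
  return index;
}

// Splitter 1: one chunk per sentence.
const splitBySentence: Splitter = (text) => {
  const matches: string[] = text.match(/[^.!?]+[.!?]+/g) || [];
  return matches.map((s) => s.trim());
};

// Splitter 2: fixed-size character windows.
const splitByFixedSize = (size: number): Splitter => (text) => {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
};

// Toy embedder: [character count, vowel count]. A real embedding model would go here.
const toyEmbed: Embedder = (text) => [text.length, (text.match(/[aeiou]/gi) || []).length];

const loadDocs: Loader = () => ["Retrieval fetches context. Generation writes the answer."];

// Same pipeline, different splitter: no other code changes.
const bySentence = buildIndex(loadDocs, splitBySentence, toyEmbed);
const byFixedSize = buildIndex(loadDocs, splitByFixedSize(20), toyEmbed);

console.log(bySentence.length, byFixedSize.length);
```

In a real application the loader, splitter, and embedder would come from the corresponding LangChain integrations; the composition pattern stays the same.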

Building blocks

Document loaders

Ingest data from external sources (Google Drive, Slack, Notion, etc.) and return standardized Document objects.

Embedding models

Convert text into numeric vectors such that texts with similar meaning lie close together in the vector space.
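
One common way to measure how "close" two embeddings are is cosine similarity. The sketch below assumes hand-written three-dimensional vectors purely for illustration; real embedding models produce vectors with hundreds or thousands of dimensions, and the names `cat`, `kitten`, and `invoice` are hypothetical.

```typescript
// Cosine similarity: the dot product of two vectors divided by the product of their lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as dissimilar to everything rather than dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hand-written toy vectors (assumed values, not output from a real model).
const cat = [0.9, 0.1, 0.0];
const kitten = [0.8, 0.2, 0.1];
const invoice = [0.0, 0.1, 0.9];

const related = cosineSimilarity(cat, kitten);
const unrelated = cosineSimilarity(cat, invoice);

console.log(related > unrelated);
```

With these toy values, the two semantically related vectors score much higher than the unrelated pair, which is the property that similarity search relies on.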

Vector stores

Specialized databases for storing and searching embeddings.
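
To make the idea concrete, here is a minimal in-memory sketch of what a vector store does: it keeps (text, vector) pairs and returns the entries nearest to a query vector. The `ToyVectorStore` class is a hypothetical name for illustration only and is not a LangChain class; production stores add persistence, indexing, and approximate nearest-neighbor search.

```typescript
interface StoredEntry {
  text: string;
  vector: number[];
}

interface SearchHit {
  text: string;
  score: number;
}

// Hypothetical toy store for illustration; it scans every entry on each search.
class ToyVectorStore {
  private entries: StoredEntry[] = [];

  add(text: string, vector: number[]): void {
    this.entries.push({ text, vector });
  }

  // Return the k entries whose vectors are most similar to the query vector.
  search(queryVector: number[], k: number): SearchHit[] {
    return this.entries
      .map((entry) => ({
        text: entry.text,
        score: this.cosine(queryVector, entry.vector),
      }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k);
  }

  private cosine(a: number[], b: number[]): number {
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    if (normA === 0 || normB === 0) return 0;
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
  }
}

// Assumed two-dimensional vectors; a real embedding model would supply these.
const store = new ToyVectorStore();
store.add("How to reset your password", [0.9, 0.1]);
store.add("Quarterly revenue report", [0.1, 0.9]);
store.add("Account login troubleshooting", [0.8, 0.3]);

const hits = store.search([1.0, 0.0], 2);
console.log(hits.map((hit) => hit.text));
```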

Retrievers

A retriever is an interface that returns documents given an unstructured query.
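
Because a retriever is defined by what it returns rather than by how it stores data, it does not have to be backed by a vector store. The sketch below assumes a hypothetical `SimpleRetriever` interface and a keyword-based implementation; the `Doc` shape borrows the `pageContent` and `metadata` field names from LangChain's Document objects, but everything else here is illustrative.

```typescript
interface Doc {
  pageContent: string;
  metadata: Record<string, string>;
}

// Hypothetical interface for illustration: query string in, documents out.
interface SimpleRetriever {
  retrieve(query: string): Doc[];
}

// A keyword-overlap retriever: no embeddings or vector store involved.
class KeywordRetriever implements SimpleRetriever {
  private docs: Doc[];
  private k: number;

  constructor(docs: Doc[], k: number) {
    this.docs = docs;
    this.k = k;
  }

  retrieve(query: string): Doc[] {
    const terms = query.toLowerCase().split(/\W+/).filter((t) => t.length > 0);
    return this.docs
      .map((doc) => {
        const text = doc.pageContent.toLowerCase();
        // Score = number of query terms that appear in the document text.
        const score = terms.filter((t) => text.includes(t)).length;
        return { doc, score };
      })
      .filter((scored) => scored.score > 0)
      .sort((x, y) => y.score - x.score)
      .slice(0, this.k)
      .map((scored) => scored.doc);
  }
}

const retriever: SimpleRetriever = new KeywordRetriever(
  [
    { pageContent: "Vector stores index embeddings for similarity search.", metadata: { source: "a" } },
    { pageContent: "Document loaders ingest data from Slack and Notion.", metadata: { source: "b" } },
    { pageContent: "Embedding models map text to vectors.", metadata: { source: "c" } },
  ],
  2
);

const results = retriever.retrieve("vector stores");
console.log(results.map((doc) => doc.metadata.source));
```

Any code that depends only on `SimpleRetriever` keeps working if the keyword implementation is later replaced by one that queries a vector store or an external search API.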

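RAG architectures

RAG can be implemented in multiple ways, depending on your system's needs. Each type is outlined in the sections below.

Architecture	Description	Control	Flexibility	Latency	Example use case
2-Step RAG	Retrieval always happens before generation. Simple and predictable	✅ High	❌ Low	⚡ Fast	FAQs, documentation bots
Agentic RAG	An LLM-powered agent decides when and how to retrieve during reasoning	❌ Low	✅ High	⏳ Variable	Research assistants with access to multiple tools
Hybrid	Combines characteristics of both approaches and includes validation steps	⚖️ Medium	⚖️ Medium	⏳ Variable	Domain-specific Q&A with quality validation

Latency: In 2-Step RAG, latency is generally more predictable because the maximum number of LLM calls is known and capped. This predictability assumes that LLM inference time is the dominant factor. However, real-world latency may also be affected by the performance of the retrieval step, such as API response times, network delays, or database queries, which vary with the tools and infrastructure in use.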

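2-Step RAG

In 2-Step RAG, the retrieval step always runs before the generation step. This architecture is straightforward and predictable, which makes it suitable for the many applications where retrieving relevant documents is a clear prerequisite for generating an answer.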
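
The fixed retrieve-then-generate order can be sketched as follows. This is a minimal illustration under stated assumptions: `knowledgeBase`, `retrieve`, `buildPrompt`, and `twoStepRag` are hypothetical names, retrieval is a simple keyword match, and the model is a synchronous stub (a real model call would be asynchronous).

```typescript
// Stand-in for a model call; a real call would be asynchronous.
type Generate = (prompt: string) => string;

const knowledgeBase: string[] = [
  "Refunds are accepted within 30 days of purchase.",
  "Shipping takes 3 to 5 business days.",
  "Support is available by email on weekdays.",
];

// Step 1: retrieval. Keyword overlap, ignoring short words such as "is" or "the".
function retrieve(question: string, k: number): string[] {
  const terms = question.toLowerCase().split(/\W+/).filter((t) => t.length > 3);
  return knowledgeBase
    .map((text) => ({
      text,
      score: terms.filter((t) => text.toLowerCase().includes(t)).length,
    }))
    .filter((scored) => scored.score > 0)
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.text);
}

function buildPrompt(question: string, context: string[]): string {
  return [
    "Answer the question using only the context below.",
    "Context:",
    ...context.map((c, i) => `[${i + 1}] ${c}`),
    `Question: ${question}`,
  ].join("\n");
}

// Step 2: generation. Exactly one model call per question, always after retrieval.
function twoStepRag(question: string, generate: Generate): string {
  const context = retrieve(question, 2);
  return generate(buildPrompt(question, context));
}

// Stub model that records how it was called.
let modelCalls = 0;
let lastPrompt = "";
const stubModel: Generate = (prompt) => {
  modelCalls += 1;
  lastPrompt = prompt;
  return "stub answer";
};

const answer = twoStepRag("What is the refund window?", stubModel);
console.log(answer, modelCalls);
```

Because the control flow is fixed, the number of model calls per question is known in advance, which is what makes the latency of this architecture comparatively predictable.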

Tutorial: Retrieval-Augmented Generation (RAG)

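See how to build a Q&A chatbot that answers questions grounded in your data using Retrieval-Augmented Generation (RAG). This tutorial walks through two approaches:

A RAG agent that runs searches with a flexible tool, which suits general-purpose use.
A 2-step RAG chain that requires just one LLM call per query, which is fast and efficient for simpler tasks.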

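Agentic RAG

Agentic Retrieval-Augmented Generation (RAG) combines the strengths of Retrieval-Augmented Generation with agent-based reasoning. Instead of retrieving documents before answering, an agent (powered by an LLM) reasons step by step and decides when and how to retrieve information during the interaction.

The only thing an agent needs to enable RAG behavior is access to one or more tools that can fetch external knowledge, such as documentation loaders, web APIs, or database queries.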

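import { tool, createAgent } from "langchain";

// Placeholder tool: a real implementation would fetch and return the page text.
const fetchUrl = tool(
    (url: string) => {
        return `Fetched content from ${url}`;
    },
    { name: "fetch_url", description: "Fetch text content from a URL" }
);

// Placeholder prompt; replace it with instructions suited to your application.
const systemPrompt = "Use the fetch_url tool when you need content from a web page.";

const agent = createAgent({
    model: "claude-sonnet-4-0",
    tools: [fetchUrl],
    systemPrompt,
});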

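Extended example: Agentic RAG for LangGraph's llms.txt

This example implements an Agentic RAG system that helps users query the LangGraph documentation. The agent starts by loading llms.txt, which lists the available documentation URLs, and can then use the fetch_documentation tool to retrieve and process relevant content based on the user's question.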

import { tool, createAgent, HumanMessage } from "langchain";
import * as z from "zod";

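// Only URLs that start with one of these prefixes may be fetched by the tool.
const ALLOWED_DOMAINS = ["https://langchain-ai.github.io/"];
// Index file that lists the available LangGraph documentation pages.
const LLMS_TXT = "https://langchain-ai.github.io/langgraph/llms.txt";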

const fetchDocumentation = tool(
  async (input) => {
    // Reject any URL outside the allowlist before making a network request.
    if (!ALLOWED_DOMAINS.some((domain) => input.url.startsWith(domain))) {
      return `Error: URL not allowed. Must start with one of: ${ALLOWED_DOMAINS.join(", ")}`;
    }
    const response = await fetch(input.url);
    if (!response.ok) {
      throw new Error(`HTTP error! status: ${response.status}`);
    }
    return response.text();
  },
  {
    name: "fetch_documentation",
    description: "Fetch and convert documentation from a URL",
    schema: z.object({
      url: z.string().describe("The URL of the documentation to fetch"),
    }),
  }
);

// Load the documentation index once so it can be embedded in the system prompt.
const llmsTxtResponse = await fetch(LLMS_TXT);
const llmsTxtContent = await llmsTxtResponse.text();

const systemPrompt = `
You are an expert TypeScript developer and technical assistant.
Your primary role is to help users with questions about LangGraph and related tools.

Instructions:

1. If a user asks a question you're unsure about — or one that likely involves API usage,
   behavior, or configuration — you MUST use the \`fetch_documentation\` tool to consult the relevant docs.
2. When citing documentation, summarize clearly and include relevant context from the content.
3. Do not use any URLs outside of the allowed domain.
4. If a documentation fetch fails, tell the user and proceed with your best expert understanding.

You can access official documentation from the following approved sources:

${llmsTxtContent}

You MUST consult the documentation to get up to date documentation
before answering a user's question about LangGraph.

Your answers should be clear, concise, and technically accurate.
`;

const tools = [fetchDocumentation];

const agent = createAgent({
  model: "claude-sonnet-4-0",
  tools,
  systemPrompt,
  name: "Agentic RAG",
});

const response = await agent.invoke({
  messages: [
    new HumanMessage(
      "Write a short example of a langgraph agent using the " +
      "prebuilt create react agent. the agent should be able " +
      "to look up stock pricing information."
    ),
  ],
});

console.log(response.messages.at(-1)?.content);

Tutorial: Retrieval-Augmented Generation (RAG)

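See how to build a Q&A chatbot that answers questions grounded in your data using Retrieval-Augmented Generation (RAG). This tutorial walks through two approaches:

A RAG agent that runs searches with a flexible tool, which suits general-purpose use.
A 2-step RAG chain that requires just one LLM call per query, which is fast and efficient for simpler tasks.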

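Hybrid RAG

Hybrid RAG combines characteristics of both 2-Step and Agentic RAG. It introduces intermediate steps such as query preprocessing, retrieval validation, and post-generation checks. These systems offer more flexibility than fixed pipelines while maintaining some control over execution. Typical components include:

Query enhancement: Modify the input question to improve retrieval quality. This can involve rewriting unclear queries, generating multiple variations, or expanding the query with additional context.
Retrieval validation: Evaluate whether the retrieved documents are relevant and sufficient. If they are not, the system may refine the query and retrieve again.
Answer validation: Check the generated answer for accuracy, completeness, and alignment with the source content. If needed, the system can regenerate or revise the answer.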
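
The three components above can be wired into a bounded retry loop. The sketch below is one possible arrangement, not a prescribed design: `HybridSteps`, `hybridRag`, and every stub passed to it are hypothetical, and in practice the rewrite, generate, and validation steps would typically call an LLM.

```typescript
// Hypothetical step signatures; each would usually be backed by a model or a search system.
interface HybridSteps {
  rewriteQuery: (question: string, attempt: number) => string;
  retrieve: (query: string) => string[];
  isSufficient: (question: string, docs: string[]) => boolean;
  generate: (question: string, docs: string[]) => string;
  isGrounded: (answer: string, docs: string[]) => boolean;
}

interface HybridResult {
  answer: string | null;
  attempts: number;
}

function hybridRag(question: string, steps: HybridSteps, maxAttempts = 3): HybridResult {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Query enhancement: later attempts may rewrite or expand the query.
    const query = steps.rewriteQuery(question, attempt);
    const docs = steps.retrieve(query);
    // Retrieval validation: retry with a refined query if the documents fall short.
    if (!steps.isSufficient(question, docs)) continue;
    const answer = steps.generate(question, docs);
    // Answer validation: accept only answers supported by the retrieved documents.
    if (steps.isGrounded(answer, docs)) return { answer, attempts: attempt };
  }
  // The attempt cap keeps execution bounded even when validation keeps failing.
  return { answer: null, attempts: maxAttempts };
}

const corpus = [
  "The API rate limit is 100 requests per minute.",
  "The dashboard supports dark mode.",
];

const result = hybridRag("What are the limits?", {
  // First attempt uses the question as-is; later attempts add assumed domain terms.
  rewriteQuery: (question, attempt) => (attempt === 1 ? question : `${question} rate limit`),
  retrieve: (query) => {
    const terms = query.toLowerCase().split(/\W+/).filter((t) => t.length > 3);
    return corpus.filter((doc) => terms.some((t) => doc.toLowerCase().includes(t)));
  },
  isSufficient: (_question, docs) => docs.length > 0,
  generate: (_question, docs) => `According to the docs: ${docs[0]}`,
  isGrounded: (answer, docs) => docs.some((doc) => answer.includes(doc)),
});

console.log(result);
```

In this toy run the first query retrieves nothing, so the loop rewrites the query and succeeds on the second attempt. The cap on attempts is what preserves some control over execution while still allowing iteration.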

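The architecture often supports multiple iterations between these steps. It is suitable for:

Applications with ambiguous or underspecified queries
Systems that require validation or quality control steps
Workflows involving multiple sources or iterative refinement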

Tutorial: Agentic RAG with Self-Correction

An example of Hybrid RAG that combines agentic reasoning with retrieval and self-correction.
