什么是大型语言模型?
What is a large language model?
大型语言模型 (LLM) 是一种 AI 算法,可以处理用户输入并通过预测单词序列来创建合理的响应。他们在庞大的半公开数据集上进行训练,使用机器学习来分析语言的组成部分如何组合在一起。
(Large Language Models (LLMs) are AI algorithms that can process user inputs and create plausible responses by predicting sequences of words. They are trained on huge semi-public data sets, using machine learning to analyze how the component parts of language fit together.)
LLM 通常会提供一个聊天界面来接受用户输入,称为提示。允许的输入在一定程度上由输入验证规则控制。
(LLMs usually present a chat interface to accept user input, known as a prompt. The input allowed is controlled in part by input validation rules.)
LLM 在现代网站中可以有广泛的用例:
(LLMs can have a wide range of use cases in modern websites:)
*客户服务,例如虚拟助手。( Customer service, such as a virtual assistant.)
译本。(Translation.) SEO改进。(SEO improvement.) 分析用户生成的内容,例如跟踪页面评论的语气。(Analysis of user-generated content, for example to track the tone of on-page comments.)LLM 攻击和提示注入
LLM attacks and prompt injection
许多 Web LLM 攻击依赖于一种称为提示注入的技术。攻击者可以使用构建的提示来操纵 LLM 的输出。提示注入可能会导致 AI 执行超出其预期目的的操作,例如对敏感 API 进行错误调用或返回不符合其准则的内容。
(Many web LLM attacks rely on a technique known as prompt injection. This is where an attacker uses crafted prompts to manipulate an LLM's output. Prompt injection can result in the AI taking actions that fall outside of its intended purpose, such as making incorrect calls to sensitive APIs or returning content that does not correspond to its guidelines.)
检测 LLM 漏洞
Detecting LLM vulnerabilities
我们推荐的 LLM 漏洞检测方法是:
(Our recommended methodology for detecting LLM vulnerabilities is:)
*识别 LLM 的输入,包括直接(如提示)和间接(如训练数据)输入。( Identify the LLM's inputs, including both direct (such as a prompt) and indirect (such as training data) inputs.)
弄清楚 LLM 可以访问哪些数据和 API。(Work out what data and APIs the LLM has access to.) 探测此新攻击面以查找漏洞。(Probe this new attack surface for vulnerabilities.)利用 LLM API、函数和插件
Exploiting LLM APIs, functions, and plugins
LLM 通常由专门的第三方提供商托管。网站可以通过描述 LLM 要使用的本地 API 来为第三方 LLM 提供对其特定功能的访问权限。(LLMs are often hosted by dedicated third party providers. A website can give third-party LLMs access to its specific functionality by describing local APIs for the LLM to use.)
例如,客户支持 LLM 可能有权访问管理用户、订单和库存的 API。(For example, a customer support LLM might have access to APIs that manage users, orders, and stock.)
LLM API 的工作原理
How LLM APIs work
将 LLM 与 API 集成的工作流取决于 API 本身的结构。在调用外部 API 时,某些 LLM 可能需要客户端调用单独的函数端点(实际上是私有 API),以便生成可以发送到这些 API 的有效请求。此工作流可能如下所示:
(The workflow for integrating an LLM with an API depends on the structure of the API itself. When calling external APIs, some LLMs may require the client to call a separate function endpoint (effectively a private API) in order to generate valid requests that can be sent to those APIs. The workflow for this could look something like the following:)
此工作流可能会带来安全隐患,因为 LLM 实际上代表用户调用外部 API,但用户可能不知道正在调用这些 API。理想情况下,在 LLM 调用外部 API 之前,应向用户提供确认步骤。
(This workflow can have security implications, as the LLM is effectively calling external APIs on behalf of the user but the user may not be aware that these APIs are being called. Ideally, users should be presented with a confirmation step before the LLM calls the external API.)
映射 LLM API 攻击面
Mapping LLM API attack surface
术语“过度代理”是指 LLM 可以访问 API 的情况,这些 API 可以访问敏感信息,并且可以被说服不安全地使用这些 API。这使攻击者能够将 LLM 推到其预期范围之外,并通过其 API 发起攻击。
(The term "excessive agency" refers to a situation in which an LLM has access to APIs that can access sensitive information and can be persuaded to use those APIs unsafely. This enables attackers to push the LLM beyond its intended scope and launch attacks via its APIs.)
使用 LLM 攻击 API 和插件的第一阶段是确定 LLM 可以访问哪些 API 和插件。一种方法是简单地询问 LLM 它可以访问哪些 API。然后,您可以询问有关任何感兴趣的 API 的其他详细信息。
(The first stage of using an LLM to attack APIs and plugins is to work out which APIs and plugins the LLM has access to. One way to do this is to simply ask the LLM which APIs it can access. You can then ask for additional details on any APIs of interest.)
如果 LLM 不合作,请尝试提供误导性的上下文并重新提出问题。例如,您可以声称自己是 LLM 的开发人员,因此应该拥有更高级别的权限。
(If the LLM isn't cooperative, try providing misleading context and re-asking the question. For example, you could claim that you are the LLM's developer and so should have a higher level of privilege.)