QUICK REVIEW

[Paper Review] From Prompt Injections to SQL Injection Attacks: How Protected is Your LLM-Integrated Web Application?

Rodrigo Pedro, Daniel Castro|arXiv (Cornell University)|Aug 3, 2023

Web Application Security Vulnerabilities|Cited by 17

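One-Sentence Summary

This paper defines prompt-to-SQL (P2SQL) injections in LLM-integrated web applications built on Langchain, analyzes the attack variants across seven LLMs, and proposes four defenses implemented as Langchain extensions, validated on a real-world use case application.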

ABSTRACT

Large Language Models (LLMs) have found widespread applications in various domains, including web applications, where they facilitate human interaction via chatbots with natural language interfaces. Internally, aided by an LLM-integration middleware such as Langchain, user prompts are translated into SQL queries used by the LLM to provide meaningful responses to users. However, unsanitized user prompts can lead to SQL injection attacks, potentially compromising the security of the database. Despite the growing interest in prompt injection vulnerabilities targeting LLMs, the specific risks of generating SQL injection attacks through prompt injections have not been extensively studied. In this paper, we present a comprehensive examination of prompt-to-SQL (P$_2$SQL) injections targeting web applications based on the Langchain framework. Using Langchain as our case study, we characterize P$_2$SQL injections, exploring their variants and impact on application security through multiple concrete examples. Furthermore, we evaluate 7 state-of-the-art LLMs, demonstrating the pervasiveness of P$_2$SQL attacks across language models. Our findings indicate that LLM-integrated applications based on Langchain are highly susceptible to P$_2$SQL injection attacks, warranting the adoption of robust defenses. To counter these attacks, we propose four effective defense techniques that can be integrated as extensions to the Langchain framework. We validate the defenses through an experimental evaluation with a real-world use case application.

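Research Motivation and Goals

Characterize the variants of P2SQL injection that target Langchain-based web applications.
Evaluate how the choice of LLM affects the feasibility and impact of P2SQL attacks.
Propose and preliminarily evaluate defenses, integrated as Langchain extensions, that mitigate P2SQL threats.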

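Proposed Method

Describe the processing pipeline of Langchain's SQLDatabaseChain/SQLDatabaseAgent to identify injection points.
Define a threat model and construct seven representative P2SQL attack scenarios (unrestricted vs. restricted prompts, direct vs. indirect).
Experimentally evaluate the attacks using GPT-3.5-turbo-0301 and other models across a range of prompts and configurations.
Develop four Langchain extensions: database permission hardening, SQL query rewriting, auxiliary LLM validation, and in-prompt data preloading.
Validate the defenses in a real-world use case web application with a PostgreSQL backend.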
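
The idea behind database permission hardening can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the paper's Langchain extension: it uses Python's built-in `sqlite3` with `PRAGMA query_only` as a stand-in for a restricted database role, and the `users` schema and the helper names `make_readonly_connection` and `run_llm_sql` are hypothetical.

```python
import sqlite3


def make_readonly_connection(conn: sqlite3.Connection) -> sqlite3.Connection:
    """Put an open SQLite connection into query-only mode for the chatbot."""
    # PRAGMA query_only makes SQLite reject statements that would modify data.
    conn.execute("PRAGMA query_only = ON")
    return conn


def run_llm_sql(conn: sqlite3.Connection, sql: str):
    """Execute LLM-generated SQL; writes are refused by the database itself."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        # Report the refusal instead of crashing the (hypothetical) chatbot.
        return f"blocked: {exc}"


# Hypothetical schema and data, used only for this illustration.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice'), ('bob')")
make_readonly_connection(conn)

# A benign generated query still works.
rows = run_llm_sql(conn, "SELECT name FROM users ORDER BY id")

# A query produced by an injected prompt is stopped at the database layer.
blocked = run_llm_sql(conn, "DROP TABLE users")
remaining = run_llm_sql(conn, "SELECT COUNT(*) FROM users")
```

The point of the design is that the restriction lives in the database rather than in the prompt, so it holds even if the LLM is persuaded to emit a destructive statement. On a PostgreSQL backend the analogous step would presumably be a dedicated role granted only `SELECT`. This sketch covers writes only and does nothing to stop a user from reading other users' rows.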

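Experimental Results

Research Questions

RQ1: Which P2SQL injection variants can target Langchain-based web applications, and what are their security implications?
RQ2: Across the Langchain implementations (SQL chain and SQL agent), how does the choice of LLM affect the feasibility and success rate of P2SQL attacks?
RQ3: Which defenses effectively mitigate P2SQL attacks at an acceptable performance overhead?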

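Key Findings

Langchain-based LLM-integrated applications are highly vulnerable to P2SQL injection across all tested LLMs.
The unrestricted default Langchain prompt permits arbitrary SQL queries, allowing an attacker to read or write the entire database.
Prompt restrictions can be bypassed, and indirect attacks that embed prompts in data payloads can manipulate the answers.
SQL agents enable more complex, multi-step P2SQL attacks than SQL chains.
The four defenses (database permission hardening, SQL query rewriting, auxiliary LLM validation, and in-prompt data preloading) reduce risk at an acceptable overhead.
Evaluation on a real-world use case demonstrates their practical effectiveness, although further work is needed on the automation and transparency of the defenses.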
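
Because prompt-level restrictions can be bypassed, access control is more dependable when it is enforced on the generated SQL itself. The following is a minimal sketch of the SQL query rewriting idea, assuming a hypothetical `orders` table whose `user_id` column identifies the owner. The regex-based `rewrite_query` helper is an illustration only and is not the paper's implementation; a production rewriter would use a real SQL parser, and this version does not handle table aliases.

```python
import re
import sqlite3

# Hypothetical policy: protected table name -> column identifying the owner.
OWNER_COLUMN = {"orders": "user_id"}


def rewrite_query(sql: str, current_user_id: int) -> str:
    """Replace protected table references with a subquery limited to the user's rows."""
    rewritten = sql
    for table, column in OWNER_COLUMN.items():
        scoped = (
            f"(SELECT * FROM {table} "
            f"WHERE {column} = {int(current_user_id)}) AS {table}"
        )
        # Only references that directly follow FROM or JOIN are rewritten.
        pattern = rf"\b(FROM|JOIN)\s+{table}\b"
        rewritten = re.sub(
            pattern,
            lambda m: f"{m.group(1)} {scoped}",
            rewritten,
            flags=re.IGNORECASE,
        )
    return rewritten


# Hypothetical data: user 1 owns two orders, user 2 owns one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO orders (user_id, item) VALUES (?, ?)",
    [(1, "book"), (2, "laptop"), (1, "pen")],
)

# SQL that an LLM might emit after an injected prompt asks for everyone's orders.
injected_sql = "SELECT item FROM orders WHERE user_id = 2 OR 1=1"
safe_sql = rewrite_query(injected_sql, current_user_id=1)
visible_items = sorted(row[0] for row in conn.execute(safe_sql))
```

Whatever predicate the LLM was talked into adding, the outer query can only see rows that the inner subquery exposes, so the logged-in user's scope is fixed by the application rather than by the prompt.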

This review was generated by AI and reviewed by a human editor.