Prompt Injection attack against LLM-integrated Applications

Authors: Yi Liu, Gelei Deng, Yuekang Li, Kailong Wang, Zihao Wang, Xiaofeng Wang, Tianwei Zhang, Yepang Liu, Haoyu Wang, Yan Zheng, Yang Liu

Published: 2023-06-08

arXiv ID: 2306.05499v2

Added to Library: 2025-11-11 14:02 UTC

Red Teaming

📄 Abstract

Large Language Models (LLMs), renowned for their superior proficiency in language comprehension and generation, stimulate a vibrant ecosystem of applications around them. However, their extensive assimilation into various services introduces significant security risks. This study deconstructs the complexities and implications of prompt injection attacks on actual LLM-integrated applications. Initially, we conduct an exploratory analysis on ten commercial applications, highlighting the constraints of current attack strategies in practice. Prompted by these limitations, we subsequently formulate HouYi, a novel black-box prompt injection attack technique, which draws inspiration from traditional web injection attacks. HouYi is compartmentalized into three crucial elements: a seamlessly-incorporated pre-constructed prompt, an injection prompt inducing context partition, and a malicious payload designed to fulfill the attack objectives. Leveraging HouYi, we unveil previously unknown and severe attack outcomes, such as unrestricted arbitrary LLM usage and uncomplicated application prompt theft. We deploy HouYi on 36 actual LLM-integrated applications and discern 31 applications susceptible to prompt injection. 10 vendors have validated our discoveries, including Notion, which has the potential to impact millions of users. Our investigation illuminates both the possible risks of prompt injection attacks and the possible tactics for mitigation.

🔍 Key Points

Introduction of HouYi, a pioneering black-box prompt injection attack methodology targeting LLM-integrated applications, emphasizing systematic exploitation techniques.
Detailed pilot study outcomes exposing limitations of existing prompt injection methods and enhanced understanding of application vulnerabilities.
The framework includes three critical components (Framework, Separator, and Disruptor) enabling effective prompt injections by manipulating application context and bypassing defenses.
Successful deployment of HouYi on 36 applications, revealing 31 susceptible to prompt injections, thus indicating widespread vulnerabilities in LLM-integrated services.
Found significant implications for application security, including potential financial loss for service providers due to unauthorized LLM usage and prompt theft.

💡 Why This Paper Matters

This paper presents critical insights into the real-world feasibility of prompt injection attacks on LLM-integrated applications through both novel methodologies and comprehensive experimental validation. The findings underscore significant security vulnerabilities, drawing attention to the urgent need for stronger defensive measures in the integration of AI technologies.

🎯 Why It's Interesting for AI Security Researchers

The research is of particular interest to AI security researchers as it not only identifies and characterizes new attack vectors within LLM-integrated systems but also provides a systematic framework that can be utilized for understanding and formulating defenses against these kinds of vulnerabilities. Furthermore, the implications for industry standards and practices in securing AI applications resonate with ongoing efforts to enhance the robustness of AI technologies.

Prompt Injection attack against LLM-integrated Applications

📄 Abstract

🔍 Key Points

💡 Why This Paper Matters

🎯 Why It's Interesting for AI Security Researchers

📚 Read the Full Paper