VortexPIA: Indirect Prompt Injection Attack against LLMs for Efficient Extraction of User Privacy

📄 Abstract

Large language models (LLMs) have been widely deployed in Conversational AIs (CAIs), while exposing privacy and security threats. Recent research shows that LLM-based CAIs can be manipulated to extract private information from human users, posing serious security threats. However, the methods proposed in that study rely on a white-box setting that adversaries can directly modify the system prompt. This condition is unlikely to hold in real-world deployments. The limitation raises a critical question: can unprivileged attackers still induce such privacy risks in practical LLM-integrated applications? To address this question, we propose \textsc{VortexPIA}, a novel indirect prompt injection attack that induces privacy extraction in LLM-integrated applications under black-box settings. By injecting token-efficient data containing false memories, \textsc{VortexPIA} misleads LLMs to actively request private information in batches. Unlike prior methods, \textsc{VortexPIA} allows attackers to flexibly define multiple categories of sensitive data. We evaluate \textsc{VortexPIA} on six LLMs, covering both traditional and reasoning LLMs, across four benchmark datasets. The results show that \textsc{VortexPIA} significantly outperforms baselines and achieves state-of-the-art (SOTA) performance. It also demonstrates efficient privacy requests, reduced token consumption, and enhanced robustness against defense mechanisms. We further validate \textsc{VortexPIA} on multiple realistic open-source LLM-integrated applications, demonstrating its practical effectiveness.

🔍 Key Points

Introduction of VortexPIA, a novel indirect prompt injection attack targeting LLMs under black-box settings, which seeks to extract user privacy effectively.
VortexPIA demonstrates superior performance compared to existing methods, achieving state-of-the-art (SOTA) results across multiple LLMs and datasets, with a significant increase in attack success rate and reduced token consumption.
The research evaluates the robustness of VortexPIA against common defense mechanisms, showing improved resilience in eliciting private information from users.
In-depth analysis links the reasoning capabilities of LLMs with their vulnerability to privacy extraction attacks, suggesting that LLMs with advanced reasoning are more likely to expose user data.

💡 Why This Paper Matters

The paper presents VortexPIA, an advanced technique that underscores the potential privacy threats posed by LLMs in real-world applications, especially in black-box environments. It fills an important research gap by illustrating how unprivileged attackers can exploit LLMs to solicit sensitive user data, highlighting the necessity of enhanced security measures in AI systems.

🎯 Why It's Interesting for AI Security Researchers

This paper is particularly relevant to AI security researchers as it reveals critical vulnerabilities within language models that could be leveraged for privacy breaches. Understanding these attack vectors enables researchers to develop better defense strategies, contributing to the overall security architecture around LLMs and their integration in applications.

VortexPIA: Indirect Prompt Injection Attack against LLMs for Efficient Extraction of User Privacy

📄 Abstract

🔍 Key Points

💡 Why This Paper Matters

🎯 Why It's Interesting for AI Security Researchers

📚 Read the Full Paper