PIDP-Attack: Combining Prompt Injection with Database Poisoning Attacks on Retrieval-Augmented Generation Systems

📄 Abstract

Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of applications. However, their practical deployment is often hindered by issues such as outdated knowledge and the tendency to generate hallucinations. To address these limitations, Retrieval-Augmented Generation (RAG) systems have been introduced, enhancing LLMs with external, up-to-date knowledge sources. Despite their advantages, RAG systems remain vulnerable to adversarial attacks, with data poisoning emerging as a prominent threat. Existing poisoning-based attacks typically require prior knowledge of the user's specific queries, limiting their flexibility and real-world applicability. In this work, we propose PIDP-Attack, a novel compound attack that integrates prompt injection with database poisoning in RAG. By appending malicious characters to queries at inference time and injecting a limited number of poisoned passages into the retrieval database, our method can effectively manipulate LLM response to arbitrary query without prior knowledge of the user's actual query. Experimental evaluations across three benchmark datasets (Natural Questions, HotpotQA, MS-MARCO) and eight LLMs demonstrate that PIDP-Attack consistently outperforms the original PoisonedRAG. Specifically, our method improves attack success rates by 4% to 16% on open-domain QA tasks while maintaining high retrieval precision, proving that the compound attack strategy is both necessary and highly effective.

🔍 Key Points

Introduction of PIDP-Attack, a novel attack that combines prompt injection and database poisoning targeting Retrieval-Augmented Generation (RAG) systems, allowing manipulation of LLM outputs without prior knowledge of user queries.
Demonstrated improved attack success rates (ASR) ranging from 4% to 16% across various benchmark datasets (NQ, HotpotQA, MS-MARCO) when compared to existing attacks, emphasizing the effectiveness of combining two attack vectors.
Ablation studies reveal that both retrieval and instruction manipulation are critical for achieving high attack effectiveness, with insights on how success varies with different poisoning and context budgets.
Results indicate that the compound nature of PIDP-Attack is a significant improvement over prior methods that either relied on only data poisoning or prompt injection, reinforcing the need for robust defenses in RAG deployments.
The work emphasizes security implications for LLMs and RAG systems, discussing potential vulnerabilities and mitigation measures that stakeholders should consider.

💡 Why This Paper Matters

The paper presents a significant advancement in understanding and exploiting vulnerabilities in RAG systems through the PIDP-Attack mechanism. This novel approach not only highlights the risks associated with current implementations of LLMs but also provides empirical results that underscore the necessity of comprehensive security measures in deployment. By combining prompt injection with database poisoning, the research establishes a critical foundation for future studies on AI system security.

🎯 Why It's Interesting for AI Security Researchers

This paper is particularly relevant to AI security researchers due to its investigation into complex adversarial attacks against RAG systems, which are widely utilized in various applications. The PIDP-Attack not only exposes critical vulnerabilities inherent in these systems but also offers empirical evidence demonstrating the attack's effectiveness compared to existing methods. Understanding such attacks is crucial for developing better defense mechanisms and ensuring the safe deployment of AI technologies.

PIDP-Attack: Combining Prompt Injection with Database Poisoning Attacks on Retrieval-Augmented Generation Systems

📄 Abstract

🔍 Key Points

💡 Why This Paper Matters

🎯 Why It's Interesting for AI Security Researchers

📚 Read the Full Paper