← Back to Library

When AI Meets the Web: Prompt Injection Risks in Third-Party AI Chatbot Plugins

Authors: Yigitcan Kaya, Anton Landerer, Stijn Pletinckx, Michelle Zimmermann, Christopher Kruegel, Giovanni Vigna

Published: 2025-11-08

arXiv ID: 2511.05797v1

Added to Library: 2025-11-11 14:24 UTC

Red Teaming

📄 Abstract

Prompt injection attacks pose a critical threat to large language models (LLMs), with prior work focusing on cutting-edge LLM applications like personal copilots. In contrast, simpler LLM applications, such as customer service chatbots, are widespread on the web, yet their security posture and exposure to such attacks remain poorly understood. These applications often rely on third-party chatbot plugins that act as intermediaries to commercial LLM APIs, offering non-expert website builders intuitive ways to customize chatbot behaviors. To bridge this gap, we present the first large-scale study of 17 third-party chatbot plugins used by over 10,000 public websites, uncovering previously unknown prompt injection risks in practice. First, 8 of these plugins (used by 8,000 websites) fail to enforce the integrity of the conversation history transmitted in network requests between the website visitor and the chatbot. This oversight amplifies the impact of direct prompt injection attacks by allowing adversaries to forge conversation histories (including fake system messages), boosting their ability to elicit unintended behavior (e.g., code generation) by 3 to 8x. Second, 15 plugins offer tools, such as web-scraping, to enrich the chatbot's context with website-specific content. However, these tools do not distinguish the website's trusted content (e.g., product descriptions) from untrusted, third-party content (e.g., customer reviews), introducing a risk of indirect prompt injection. Notably, we found that ~13% of e-commerce websites have already exposed their chatbots to third-party content. We systematically evaluate both vulnerabilities through controlled experiments grounded in real-world observations, focusing on factors such as system prompt design and the underlying LLM. Our findings show that many plugins adopt insecure practices that undermine the built-in LLM safeguards.

🔍 Key Points

  • The paper conducts the first large-scale study of vulnerabilities in third-party AI chatbot plugins, revealing prompt injection risks that can lead to significant exploit possibilities in real-world scenarios.
  • Identifies critical vulnerabilities including direct prompt injection via message history forging, allowing attackers to inject forged messages and manipulating chatbot responses, thus compromising security.
  • The research demonstrates that many plugins expose chatbots to indirect prompt injection risks from untrusted external content, revealing that around 13% of e-commerce websites have chatbots susceptible to such attacks.
  • Controlled experiments were performed to quantify the success of different prompt injection techniques across various configurations, showing how plugin design and context injection strategies impact security.
  • The authors propose two lightweight defenses that can be integrated into plugins to mitigate identified vulnerabilities: content sanitization for user-generated content and hardening tool instructions.

💡 Why This Paper Matters

This paper is critical as it not only highlights specific vulnerabilities in widely used AI chatbot plugins but also underscores the need for enhanced security in an ecosystem that facilitates advanced AI applications. By providing empirical evidence of existing risks, it serves as a call to action for developers and researchers to prioritize security in AI interactions.

🎯 Why It's Interesting for AI Security Researchers

The findings of this paper are particularly significant for AI security researchers because it presents a comprehensive analysis of security vulnerabilities that are often overlooked in mainstream AI deployments. It identifies specific risks and offers empirical data, which can guide future research and development of more secure AI systems, ensuring that the community is aware of potential attack vectors and mitigations in the rapidly evolving landscape of AI applications.

📚 Read the Full Paper