← Back to Library

AdapTools: Adaptive Tool-based Indirect Prompt Injection Attacks on Agentic LLMs

Authors: Che Wang, Jiaming Zhang, Ziqi Zhang, Zijie Wang, Yinghui Wang, Jianbo Gao, Tao Wei, Zhong Chen, Wei Yang Bryan Lim

Published: 2026-02-24

arXiv ID: 2602.20720v1

Added to Library: 2026-02-25 03:01 UTC

Red Teaming

πŸ“„ Abstract

The integration of external data services (e.g., Model Context Protocol, MCP) has made large language model-based agents increasingly powerful for complex task execution. However, this advancement introduces critical security vulnerabilities, particularly indirect prompt injection (IPI) attacks. Existing attack methods are limited by their reliance on static patterns and evaluation on simple language models, failing to address the fast-evolving nature of modern AI agents. We introduce AdapTools, a novel adaptive IPI attack framework that selects stealthier attack tools and generates adaptive attack prompts to create a rigorous security evaluation environment. Our approach comprises two key components: (1) Adaptive Attack Strategy Construction, which develops transferable adversarial strategies for prompt optimization, and (2) Attack Enhancement, which identifies stealthy tools capable of circumventing task-relevance defenses. Comprehensive experimental evaluation shows that AdapTools achieves a 2.13 times improvement in attack success rate while degrading system utility by a factor of 1.78. Notably, the framework maintains its effectiveness even against state-of-the-art defense mechanisms. Our method advances the understanding of IPI attacks and provides a useful reference for future research.

πŸ” Key Points

  • AdapTools introduces an adaptive framework for Indirect Prompt Injection (IPI) attacks, overcoming the limitations of existing methods that rely on static patterns and target simpler language models.
  • The framework includes two main components: Adaptive Attack Strategy Construction for prompt optimization, and Attack Enhancement for stealthy tool selection, making attacks more effective and harder to detect.
  • Comprehensive experiments show that AdapTools achieves over a twofold improvement in attack success rates while significantly degrading system utility, demonstrating its effectiveness against modern defense mechanisms.
  • The paper presents a new dataset, IPI-3K, that includes diverse and realistic scenarios for evaluating IPI attacks, highlighting the need for better tailored attack strategies against reasoning LLMs.
  • AdapTools reveals critical vulnerabilities in LLM-based agents that utilize external data services, emphasizing the urgent need for enhanced security measures in AI systems.

πŸ’‘ Why This Paper Matters

The findings from this research fill a significant gap in understanding and combating IPI vulnerabilities within advanced AI agent systems. By developing AdapTools, the authors not only provide a novel methodology for executing adaptive IPI attacks but also urge the academic community to prioritize the robust defense of LLM agents. This work is crucial for creating a more secure AI ecosystem, especially as these models become more integrated into sensitive and complex operational contexts.

🎯 Why It's Interesting for AI Security Researchers

This paper is essential for AI security researchers as it addresses a pressing concern in the fieldβ€”securing large language model (LLM) agents from advanced adversarial attacks. It challenges existing methodologies by presenting adaptive strategies that can outsmart current defenses, thus informing researchers about the evolving landscape of AI security threats. Furthermore, the introduction of IPI-3K as a benchmark sets the stage for future studies to develop more resilient systems against such adaptive vulnerabilities.

πŸ“š Read the Full Paper