Prompt Injection Attack to Tool Selection in LLM Agents

Authors: Jiawen Shi, Zenghui Yuan, Guiyao Tie, Pan Zhou, Neil Zhenqiang Gong, Lichao Sun

Published: 2025-04-28

arXiv ID: 2504.19793v3

Added to Library: 2025-11-11 14:19 UTC

Red Teaming

📄 Abstract

Tool selection is a key component of LLM agents. A popular approach follows a two-step process - \emph{retrieval} and \emph{selection} - to pick the most appropriate tool from a tool library for a given task. In this work, we introduce \textit{ToolHijacker}, a novel prompt injection attack targeting tool selection in no-box scenarios. ToolHijacker injects a malicious tool document into the tool library to manipulate the LLM agent's tool selection process, compelling it to consistently choose the attacker's malicious tool for an attacker-chosen target task. Specifically, we formulate the crafting of such tool documents as an optimization problem and propose a two-phase optimization strategy to solve it. Our extensive experimental evaluation shows that ToolHijacker is highly effective, significantly outperforming existing manual-based and automated prompt injection attacks when applied to tool selection. Moreover, we explore various defenses, including prevention-based defenses (StruQ and SecAlign) and detection-based defenses (known-answer detection, DataSentinel, perplexity detection, and perplexity windowed detection). Our experimental results indicate that these defenses are insufficient, highlighting the urgent need for developing new defense strategies.

🔍 Key Points

Introduction of ToolHijacker, a prompt injection attack specifically targeting tool selection in LLM agents, which is the first of its kind.
Formulation of the attack as an optimization problem that involves a two-phase optimization strategy to craft malicious tool documents effectively.
Extensive experimental evaluation demonstrating ToolHijacker's effectiveness in outperforming existing manual and automated prompt injection attacks on multiple LLMs and datasets.
Investigation of various prevention-based and detection-based defenses, revealing their inadequacies against ToolHijacker, thus underscoring the urgent need for improved defense mechanisms.
The paper provides systematic insights and baseline performance metrics that can guide future research in AI security and tool selection processes.

💡 Why This Paper Matters

The paper is significant as it identifies a crucial vulnerability in LLM agents' tool selection processes. By developing ToolHijacker, the authors not only showcase a novel attack method but also highlight the shortcomings of current defenses, prompting further investigations and innovations in AI security. The findings stress the need for robust frameworks to safeguard against such sophisticated attacks, making it a valuable resource for both researchers and practitioners in the field.

🎯 Why It's Interesting for AI Security Researchers

This paper would be of great interest to AI security researchers as it provides a detailed examination of a novel attack vector that exploits an existing vulnerability in LLMs. The insights gained from the experiments conducted can serve as a foundation for developing more secure LLM systems. Furthermore, the analysis of defense mechanisms against ToolHijacker highlights significant gaps in current security protocols, motivating researchers to innovate new solutions to protect against prompt injection attacks.

Prompt Injection Attack to Tool Selection in LLM Agents

📄 Abstract

🔍 Key Points

💡 Why This Paper Matters

🎯 Why It's Interesting for AI Security Researchers

📚 Read the Full Paper