
Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search

Authors: Yulin Shen, Xudong Pan, Geng Hong, Min Yang

Published: 2026-03-25

arXiv ID: 2603.24203v1

Added to Library: 2026-03-26 02:01 UTC

Red Teaming

📄 Abstract

Recent advances in the Model Context Protocol (MCP) have enabled large language models (LLMs) to invoke external tools with unprecedented ease, creating a new class of powerful, tool-augmented agents. Unfortunately, this capability also introduces an underexplored attack surface: the malicious manipulation of tool responses. Existing indirect prompt injection techniques that target MCP suffer from high deployment costs, weak semantic coherence, or heavy white-box requirements, and they are often easily detected by recently proposed defenses. In this paper, we propose Tree-structured Injection for Payloads (TIP), a novel black-box attack that generates natural payloads to reliably seize control of MCP-enabled agents even under defense. Technically, we cast payload generation as a tree-structured search problem and guide the search with an attacker LLM operating under our proposed coarse-to-fine optimization framework. To stabilize learning and avoid local optima, we introduce a path-aware feedback mechanism that surfaces only high-quality historical trajectories to the attacker model. The framework is further hardened against defensive transformations by explicitly conditioning the search on observable defense signals and dynamically reallocating the exploration budget. Extensive experiments on four mainstream LLMs show that TIP attains over 95% attack success in undefended settings while requiring an order of magnitude fewer queries than prior adaptive attacks. Against four representative defense approaches, TIP preserves more than 50% effectiveness and significantly outperforms state-of-the-art attacks. By implementing the attack on real-world MCP systems, our results expose an invisible but practical threat vector in MCP deployments. We also discuss potential mitigation approaches to address this critical security gap.
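To make the search procedure concrete, the sketch below shows a minimal best-first, tree-structured payload search in the spirit of TIP. The helpers `propose_children` (an attacker-LLM call that refines a payload, conditioned on the path that produced it) and `score` (a black-box probe of the target agent's response) are hypothetical placeholders rather than the paper's actual interfaces, and the budget and branching parameters are illustrative; the real framework additionally conditions proposals on observable defense signals.

```python
# Minimal sketch of a tree-structured payload search in the spirit of TIP.
# `propose_children` and `score` are assumed placeholders, not the paper's API.
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Node:
    neg_score: float                # heapq is a min-heap, so store -score
    payload: str = field(compare=False)
    path: list = field(compare=False, default_factory=list)


def tree_search(seed, propose_children, score, budget=50, width=3):
    """Best-first search over candidate injection payloads.

    Always expands the highest-scoring frontier node, so the query
    budget is dynamically reallocated toward promising branches.
    """
    root = Node(-score(seed), seed, [seed])
    frontier = [root]
    best = root
    queries = 1
    while frontier and queries < budget:
        node = heapq.heappop(frontier)          # most promising candidate
        for child_payload in propose_children(node.payload, node.path)[:width]:
            s = score(child_payload)            # black-box success signal
            queries += 1
            child = Node(-s, child_payload, node.path + [child_payload])
            heapq.heappush(frontier, child)
            if s > -best.neg_score:
                best = child
            if queries >= budget:
                break
    return best.payload, -best.neg_score
```

Expanding the highest-scoring frontier node first is one simple way to realize the abstract's dynamic reallocation of the exploration budget: branches whose payloads probe well consume more of the query budget, while weak branches are starved.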

🔍 Key Points

  • The paper introduces the Tree-structured Injection for Payloads (TIP) framework, a novel black-box attack method designed to exploit the Model Context Protocol (MCP) in large language models (LLMs).
  • TIP generates stealthy injection payloads through a tree-structured adaptive search that prioritizes both semantic coherence and adversarial effectiveness, allowing it to surpass existing techniques in attack success rate and query efficiency (a simplified feedback-selection step is sketched after this list).
  • Extensive experiments demonstrate that TIP achieves over 95% attack success rates in undefended settings and maintains over 50% effectiveness against state-of-the-art defense mechanisms, highlighting a significant vulnerability in the MCP ecosystem.
  • The authors emphasize the practical implications of their findings, exposing real-world security risks associated with tool-augmented LLMs and stressing the urgent need for improved defenses against such attacks.
  • A case study illustrates TIP's effectiveness in real-world scenarios, showing how the attack exploits the inherent trust placed in third-party tools and the consequences when that trust is abused.
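The path-aware feedback mechanism can likewise be pictured as a trajectory-filtering step that decides what history the attacker LLM gets to see. The trajectory record format, score threshold, and top-k cutoff below are illustrative assumptions, not the paper's specification:

```python
# Minimal sketch of path-aware feedback selection, assuming trajectories are
# dicts of the form {"path": [payload, ...], "score": float} (a hypothetical
# record format chosen here for illustration).
def select_feedback(trajectories, k=3, min_score=0.5):
    """Surface only high-quality historical trajectories to the attacker LLM.

    Filtering out low-scoring paths keeps the attacker model's refinements
    anchored to what has worked, stabilizing the search and helping it
    escape local optima.
    """
    good = [t for t in trajectories if t["score"] >= min_score]
    good.sort(key=lambda t: t["score"], reverse=True)
    lines = []
    for t in good[:k]:
        steps = " -> ".join(t["path"])
        lines.append(f"score={t['score']:.2f}: {steps}")
    return "\n".join(lines)   # appended to the attacker LLM's next prompt
```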

💡 Why This Paper Matters

This paper is highly relevant as it uncovers critical vulnerabilities in the integration of large language models with external tools through the Model Context Protocol. The proposed TIP framework not only highlights the ease with which adversaries can manipulate these systems but also reveals the inadequacies of current defense mechanisms, necessitating urgent improvements in AI security protocols.

🎯 Why It's Interesting for AI Security Researchers

The findings of this study are of significant interest to AI security researchers as they highlight an underexplored attack vector in LLMs, demonstrating the potential for real-world exploitation of tool-augmented systems. The empirical results and advanced methodologies presented in the paper provide valuable insights for developing more robust defenses and understanding adversarial behaviors in AI systems.

📚 Read the Full Paper