Breaking the Protocol: Security Analysis of the Model Context Protocol Specification and Prompt Injection Vulnerabilities in Tool-Integrated LLM Agents

Authors: Narek Maloyan, Dmitry Namiot

Published: 2026-01-24

arXiv ID: 2601.17549v1

Added to Library: 2026-01-27 03:01 UTC

Red Teaming

πŸ“„ Abstract

The Model Context Protocol (MCP) has emerged as a de facto standard for integrating Large Language Models with external tools, yet no formal security analysis of the protocol specification exists. We present the first rigorous security analysis of MCP's architectural design, identifying three fundamental protocol-level vulnerabilities: (1) absence of capability attestation allowing servers to claim arbitrary permissions, (2) bidirectional sampling without origin authentication enabling server-side prompt injection, and (3) implicit trust propagation in multi-server configurations. We implement MCPBench, a novel framework bridging existing agent security benchmarks to MCP-compliant infrastructure, enabling direct measurement of protocol-specific attack surfaces. Through controlled experiments on 847 attack scenarios across five MCP server implementations, we demonstrate that MCP's architectural choices amplify attack success rates by 23–41% compared to equivalent non-MCP integrations. We propose MCPSec, a backward-compatible protocol extension adding capability attestation and message authentication, reducing attack success rates from 52.8% to 12.4% with median latency overhead of 8.3 ms per message. Our findings establish that MCP's security weaknesses are architectural rather than implementation-specific, requiring protocol-level remediation.
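The abstract does not give implementation details for MCPSec's message authentication, but the general idea — binding an MCP-style JSON-RPC message to its claimed origin with a keyed MAC so spoofed sampling requests are rejected — can be sketched as follows. This is a minimal illustration, not the paper's actual design: the shared key, field names, and key-distribution story are all assumptions.

```python
import hashlib
import hmac
import json

# Hypothetical pre-shared key for illustration only; a real protocol
# extension would negotiate per-session keys (details not in the abstract).
SHARED_KEY = b"demo-shared-secret"

def canonical(message: dict) -> bytes:
    # Canonical JSON serialization so signer and verifier hash identical bytes.
    return json.dumps(message, sort_keys=True, separators=(",", ":")).encode()

def sign_message(message: dict, key: bytes = SHARED_KEY) -> dict:
    """Attach an HMAC tag binding the payload (including claimed origin)."""
    tag = hmac.new(key, canonical(message), hashlib.sha256).hexdigest()
    return {**message, "auth": tag}

def verify_message(signed: dict, key: bytes = SHARED_KEY) -> bool:
    """Recompute the tag over the payload and compare in constant time."""
    received = signed.get("auth", "")
    payload = {k: v for k, v in signed.items() if k != "auth"}
    expected = hmac.new(key, canonical(payload), hashlib.sha256).hexdigest()
    return hmac.compare_digest(received, expected)

# Example: a sampling request that claims a particular server origin.
request = {
    "jsonrpc": "2.0",
    "method": "sampling/createMessage",
    "params": {"origin": "server-A"},
}
signed = sign_message(request)
assert verify_message(signed)        # untampered message verifies

# An attacker rewriting the origin invalidates the tag.
tampered = {**signed, "params": {"origin": "server-B"}}
assert not verify_message(tampered)  # origin spoofing is detected
```

The point of the sketch is the second assertion: without origin authentication, nothing in the message distinguishes a legitimate sampling request from a server-injected one, which is exactly the attack surface the paper measures.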

πŸ” Key Points

  • First systematic security analysis of the Model Context Protocol (MCP), exposing architectural vulnerabilities including absence of capability attestation, unauthenticated sampling, and implicit trust propagation.
  • Implementation of MCPBench, a framework that quantitatively evaluates MCP-integrated systems against equivalent non-MCP systems, revealing that MCP integration increases attack success rates by 23–41%.
  • Proposal of MCPSec, a backward-compatible protocol extension introducing capability attestation and message authentication, which reduces attack success rates from 52.8% to 12.4% at a median latency overhead of 8.3 ms per message.
  • Empirical testing across 847 attack scenarios demonstrated that the protocol's inherent design amplifies existing attacks, so securing MCP-compliant systems requires architectural changes rather than implementation fixes alone.

πŸ’‘ Why This Paper Matters

This paper matters because it exposes architectural flaws in the rapidly adopted Model Context Protocol, which underpins the integration of Large Language Models with external tools. By pairing a rigorous security analysis with a backward-compatible, low-overhead mitigation in MCPSec, it sets the stage for safer integrations across the AI ecosystem and directly affects the trustworthiness of AI applications. The findings underscore the need for proactive, protocol-level security measures whose lessons extend beyond MCP itself.

🎯 Why It's Interesting for AI Security Researchers

The paper is of great interest to AI security researchers as it not only identifies and characterizes critical protocol-level vulnerabilities but also presents actionable defenses. By focusing on architectural security rather than solely implementation weaknesses, it tackles a fundamental problem in ensuring safe AI agent behavior in interconnected environments. Researchers in this domain would benefit from understanding how protocol design choices can lead to security vulnerabilities and how appropriate countermeasures can be implemented.