SMCP: Secure Model Context Protocol

Authors: Xinyi Hou, Shenao Wang, Yifan Zhang, Ziluo Xue, Yanjie Zhao, Cai Fu, Haoyu Wang

Published: 2026-02-01

arXiv ID: 2602.01129v1

Added to Library: 2026-02-03 08:03 UTC

📄 Abstract

Agentic AI systems built around large language models (LLMs) are moving away from closed, single-model frameworks and toward open ecosystems that connect a variety of agents, external tools, and resources. The Model Context Protocol (MCP) has emerged as a standard to unify tool access, allowing agents to discover, invoke, and coordinate with tools more flexibly. However, as MCP becomes more widely adopted, it also brings a new set of security and privacy challenges. These include risks such as unauthorized access, tool poisoning, prompt injection, privilege escalation, and supply chain attacks, any of which can impact different parts of the protocol workflow. While recent research has examined possible attack surfaces and suggested targeted countermeasures, there is still a lack of systematic, protocol-level security improvements for MCP. To address this, we introduce the Secure Model Context Protocol (SMCP), which builds on MCP by adding unified identity management, robust mutual authentication, ongoing security context propagation, fine-grained policy enforcement, and comprehensive audit logging. In this paper, we present the main components of SMCP, explain how it helps reduce security risks, and illustrate its application with practical examples. We hope that this work will contribute to the development of agentic systems that are not only powerful and adaptable, but also secure and dependable.
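One of the SMCP components named in the abstract is fine-grained policy enforcement over tool invocations. The sketch below is purely illustrative — the class and method names (`SecurityContext`, `Policy`, `allows`) and the scope strings are assumptions, not taken from the SMCP specification — but it shows the basic shape of a default-deny, scope-based check that an SMCP-aware server could apply before dispatching a tool call.

```python
from dataclasses import dataclass

# Hypothetical sketch of SMCP-style fine-grained policy enforcement.
# All names here are illustrative, not drawn from the protocol spec.

@dataclass(frozen=True)
class SecurityContext:
    """Identity and privilege information propagated with each request."""
    agent_id: str
    scopes: frozenset

@dataclass
class Policy:
    """Maps tool names to the scopes required to invoke them."""
    required_scopes: dict

    def allows(self, ctx: SecurityContext, tool: str) -> bool:
        needed = self.required_scopes.get(tool)
        if needed is None:           # unknown tool: deny by default
            return False
        return needed <= ctx.scopes  # caller must hold every required scope

policy = Policy(required_scopes={
    "read_file":  frozenset({"fs:read"}),
    "write_file": frozenset({"fs:read", "fs:write"}),
})

ctx = SecurityContext(agent_id="agent-7", scopes=frozenset({"fs:read"}))
print(policy.allows(ctx, "read_file"))   # True: required scope is held
print(policy.allows(ctx, "write_file"))  # False: fs:write is missing
print(policy.allows(ctx, "exec_shell"))  # False: unknown tool, default-deny
```

The default-deny stance on unlisted tools is the key design choice: it limits the blast radius of tool poisoning or privilege escalation, since a newly registered or renamed tool grants nothing until a policy entry explicitly allows it.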

🔍 Key Points

  • MCP standardizes how LLM-based agents discover, invoke, and coordinate with external tools, but its growing adoption exposes risks across the protocol workflow, including unauthorized access, tool poisoning, prompt injection, privilege escalation, and supply chain attacks.
  • SMCP addresses these risks at the protocol level, rather than through targeted, per-attack countermeasures as in prior work.
  • Core additions over MCP: unified identity management, robust mutual authentication, ongoing security context propagation, fine-grained policy enforcement, and comprehensive audit logging.
  • The paper presents SMCP's main components, explains how each mitigates specific threats, and illustrates the design with practical examples.

💡 Why This Paper Matters

As MCP becomes the de facto standard for connecting LLM-based agents to external tools, its security gaps become ecosystem-wide risks. By moving protections into the protocol itself — identity, authentication, context propagation, policy enforcement, and auditing — rather than bolting on per-attack defenses, SMCP offers a systematic foundation for agentic systems that must be dependable in real-world deployments.

🎯 Why It's Interesting for AI Security Researchers

This work is directly relevant to AI security researchers studying the attack surface of open agent ecosystems. Prior research has largely catalogued MCP-specific threats — tool poisoning, prompt injection, privilege escalation, supply chain attacks — and proposed targeted countermeasures; SMCP instead proposes protocol-level mitigations, providing a reference design against which future attacks, formal analyses, and alternative secure-protocol proposals can be evaluated.

📚 Read the Full Paper