Beyond Model Jailbreak: Systematic Dissection of the "Ten Deadly Sins" in Embodied Intelligence

Authors: Yuhang Huang, Junchao Li, Boyang Ma, Xuelong Dai, Minghui Xu, Kaidi Xu, Yue Zhang, Jianping Wang, Xiuzhen Cheng

Published: 2025-12-06

arXiv ID: 2512.06387v1

Added to Library: 2025-12-09 03:03 UTC

📄 Abstract

Embodied AI systems integrate language models with real-world sensing, mobility, and cloud-connected mobile apps. Yet while model jailbreaks have drawn significant attention, the broader system stack of embodied intelligence remains largely unexplored. In this work, we conduct the first holistic security analysis of the Unitree Go2 platform and uncover ten cross-layer vulnerabilities, the "Ten Sins of Embodied AI Security." Using BLE sniffing, traffic interception, APK reverse engineering, cloud API testing, and hardware probing, we identify systemic weaknesses across three architectural layers: wireless provisioning, core modules, and external interfaces. These include hard-coded keys, predictable handshake tokens, WiFi credential leakage, missing TLS validation, a static SSH password, multilingual safety bypass behavior, insecure local relay channels, weak binding logic, and unrestricted firmware access. Together, they allow adversaries to hijack devices, inject arbitrary commands, extract sensitive information, or gain full physical control. Our findings show that securing embodied AI requires far more than aligning the model itself. We conclude with system-level lessons learned and recommendations for building embodied platforms that remain robust across their entire software-hardware ecosystem.
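As a concrete illustration of one vulnerability class named in the abstract, missing TLS validation, the minimal Python sketch below contrasts a cloud API client that disables certificate checking with one that verifies the server against a pinned CA bundle. The endpoint URL, query parameters, and file paths are invented for illustration and are not taken from the paper or from the Unitree Go2 software.

```python
# Hypothetical sketch of the "missing TLS validation" flaw class: the endpoint,
# parameters, and CA bundle path below are placeholders, not details from the paper.
import requests

CLOUD_API = "https://cloud.example.com/api/device/status"  # placeholder endpoint


def fetch_status_insecure(device_id: str) -> dict:
    # verify=False disables certificate checking, so an on-path attacker with a
    # self-signed certificate can read or rewrite this traffic undetected.
    resp = requests.get(CLOUD_API, params={"id": device_id}, verify=False, timeout=5)
    return resp.json()


def fetch_status_hardened(device_id: str) -> dict:
    # Verifying against a pinned CA bundle (or at least the system trust store)
    # forces the client to authenticate the server before trusting any reply.
    resp = requests.get(
        CLOUD_API,
        params={"id": device_id},
        verify="pinned_ca_bundle.pem",  # placeholder path to a pinned CA bundle
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    print(fetch_status_hardened("demo-device-001"))
```

The same pattern applies to any companion-app or relay traffic: the hardened variant fails closed when the server's certificate chain cannot be validated, instead of silently trusting whoever answers.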

🔍 Key Points

  • First holistic, system-level security analysis of a commercial embodied AI platform (the Unitree Go2), moving beyond model-level jailbreaks to the full software-hardware stack.
  • Identification of ten cross-layer vulnerabilities, the "Ten Deadly Sins," spanning three architectural layers: wireless provisioning, core modules, and external interfaces.
  • Methodology combining BLE sniffing, traffic interception, APK reverse engineering, cloud API testing, and hardware probing (a hypothetical secret-scanning sketch illustrating the APK step follows this list).
  • Concrete weaknesses include hard-coded keys, predictable handshake tokens, WiFi credential leakage, missing TLS validation, a static SSH password, multilingual safety bypass behavior, insecure local relay channels, weak binding logic, and unrestricted firmware access.
  • Demonstrated impact ranges from device hijacking and arbitrary command injection to sensitive-information extraction and full physical control, motivating system-level lessons and hardening recommendations.
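To make the APK reverse-engineering step above more concrete, here is a minimal, hypothetical sketch of scanning a decompiled app tree for hard-coded secrets. The regex patterns, file extensions, and directory name are assumptions chosen for illustration; the paper's actual tooling and findings are not reproduced here.

```python
# Hypothetical secret scan over a decompiled APK tree; patterns and paths are
# illustrative assumptions, not artifacts or results from the paper.
import re
from pathlib import Path

SECRET_PATTERNS = [
    re.compile(r'(?i)(aes|api|secret|ssh)[-_]?(key|pass(word)?)\s*[:=]\s*["\'][^"\']{8,}["\']'),
    re.compile(r"-----BEGIN (RSA |EC )?PRIVATE KEY-----"),
]
SCAN_SUFFIXES = {".java", ".smali", ".xml", ".json", ".properties"}


def scan_tree(root: str) -> list[tuple[str, int, str]]:
    """Return (file, line number, line) triples that match a secret pattern."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in SCAN_SUFFIXES:
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(p.search(line) for p in SECRET_PATTERNS):
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for file, lineno, line in scan_tree("decompiled_apk/"):  # placeholder directory
        print(f"{file}:{lineno}: {line}")
```

A lightweight pass like this only surfaces candidates; each hit still needs manual review to confirm whether the matched string is a live credential rather than test data.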

💡 Why This Paper Matters

This paper broadens embodied AI security from the model itself to the entire system stack that surrounds it. By uncovering ten exploitable cross-layer weaknesses in the Unitree Go2, a commercially available quadruped robot platform, it shows that an aligned language model can still be hijacked, fed arbitrary commands, or brought under full physical control through insecure wireless provisioning, core modules, and external interfaces. As language-model-driven robots enter homes, campuses, and industrial settings, these system-level findings and the accompanying hardening recommendations are both timely and directly actionable.

🎯 Why It's Interesting for AI Security Researchers

The paper is of interest to AI security researchers because it reframes jailbreak-centric threat models: an adversary does not need to defeat model alignment when hard-coded keys, predictable handshake tokens, missing TLS validation, or weak binding logic offer easier routes to device takeover. The cross-layer methodology, combining BLE sniffing, traffic interception, APK reverse engineering, cloud API testing, and hardware probing, provides a reusable template for auditing other embodied platforms, and the concluding lessons point toward defenses that must span the whole software-hardware ecosystem rather than the model alone.
