
Blind Gods and Broken Screens: Architecting a Secure, Intent-Centric Mobile Agent Operating System

Authors: Zhenhua Zou, Sheng Guo, Qiuyang Zhan, Lepeng Zhao, Shuo Li, Qi Li, Ke Xu, Mingwei Xu, Zhuotao Liu

Published: 2026-02-11

arXiv ID: 2602.10915v3

Added to Library: 2026-02-16 03:00 UTC

πŸ“„ Abstract

The evolution of Large Language Models (LLMs) has shifted mobile computing from App-centric interactions to system-level autonomous agents. Current implementations predominantly rely on a "Screen-as-Interface" paradigm, which inherits structural vulnerabilities and conflicts with the mobile ecosystem's economic foundations. In this paper, we conduct a systematic security analysis of state-of-the-art mobile agents using Doubao Mobile Assistant as a representative case. We decompose the threat landscape into four dimensions - Agent Identity, External Interface, Internal Reasoning, and Action Execution - revealing critical flaws such as fake App identity, visual spoofing, indirect prompt injection, and unauthorized privilege escalation stemming from a reliance on unstructured visual data. To address these challenges, we propose Aura, an Agent Universal Runtime Architecture for a clean-slate secure agent OS. Aura replaces brittle GUI scraping with a structured, agent-native interaction model. It adopts a Hub-and-Spoke topology where a privileged System Agent orchestrates intent, sandboxed App Agents execute domain-specific tasks, and the Agent Kernel mediates all communication. The Agent Kernel enforces four defense pillars: (i) cryptographic identity binding via a Global Agent Registry; (ii) semantic input sanitization through a multilayer Semantic Firewall; (iii) cognitive integrity via taint-aware memory and plan-trajectory alignment; and (iv) granular access control with non-deniable auditing. Evaluation on MobileSafetyBench shows that, compared to Doubao, Aura improves low-risk Task Success Rate from roughly 75% to 94.3%, reduces high-risk Attack Success Rate from roughly 40% to 4.4%, and achieves near-order-of-magnitude latency gains. These results demonstrate Aura as a viable, secure alternative to the "Screen-as-Interface" paradigm.
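The Hub-and-Spoke mediation described in the abstract, where the Agent Kernel checks every App Agent's identity against a Global Agent Registry before routing a message and records the outcome in a non-deniable audit trail, can be sketched roughly as below. All class and method names are illustrative assumptions, not the paper's API, and HMAC stands in for whatever cryptographic binding Aura actually uses.

```python
import hmac
import hashlib

class GlobalAgentRegistry:
    """Hypothetical registry mapping agent IDs to registered signing keys."""
    def __init__(self):
        self._keys = {}

    def register(self, agent_id: str, key: bytes) -> None:
        self._keys[agent_id] = key

    def key_for(self, agent_id: str):
        return self._keys.get(agent_id)

class AgentKernel:
    """Mediates all agent messages; drops any whose identity check fails."""
    def __init__(self, registry: GlobalAgentRegistry):
        self.registry = registry
        self.audit_log = []  # append-only stand-in for non-deniable auditing

    def route(self, agent_id: str, message: bytes, signature: bytes) -> bool:
        key = self.registry.key_for(agent_id)
        if key is None:
            # Unregistered sender: refuse and record the attempt.
            self.audit_log.append((agent_id, "unknown agent", False))
            return False
        expected = hmac.new(key, message, hashlib.sha256).digest()
        ok = hmac.compare_digest(expected, signature)
        self.audit_log.append((agent_id, message, ok))
        return ok

# Usage: a registered agent's signed message routes; a spoofed identity does not.
registry = GlobalAgentRegistry()
registry.register("calendar_agent", b"secret-key")
kernel = AgentKernel(registry)

msg = b"create_event 2026-03-01"
good_sig = hmac.new(b"secret-key", msg, hashlib.sha256).digest()
assert kernel.route("calendar_agent", msg, good_sig) is True
assert kernel.route("fake_agent", msg, good_sig) is False  # fake App identity rejected
```

The point of the sketch is the topology: App Agents never talk to each other or to the system directly, so a fake App identity fails at the single mediation point rather than at every consumer.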

πŸ” Key Points

  • A systematic security analysis of "Screen-as-Interface" mobile agents, using Doubao Mobile Assistant as a representative case, organized along four threat dimensions: Agent Identity, External Interface, Internal Reasoning, and Action Execution.
  • Identification of critical flaws rooted in reliance on unstructured visual data, including fake App identity, visual spoofing, indirect prompt injection, and unauthorized privilege escalation.
  • Aura, a clean-slate Agent Universal Runtime Architecture that replaces brittle GUI scraping with a structured, agent-native interaction model: a Hub-and-Spoke topology with a privileged System Agent, sandboxed App Agents, and an Agent Kernel mediating all communication.
  • Four kernel-enforced defense pillars: cryptographic identity binding via a Global Agent Registry, a multilayer Semantic Firewall for input sanitization, taint-aware memory with plan-trajectory alignment for cognitive integrity, and granular access control with non-deniable auditing.
  • On MobileSafetyBench, Aura raises low-risk Task Success Rate from roughly 75% to 94.3%, cuts high-risk Attack Success Rate from roughly 40% to 4.4%, and achieves near-order-of-magnitude latency gains over Doubao.
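The taint-aware memory pillar named in the abstract can be sketched as follows: facts ingested from untrusted external content are marked tainted, and privileged actions whose arguments derive from tainted facts are denied. The class names, the action list, and the blocking policy are all assumptions for illustration, not the paper's design.

```python
class TaintAwareMemory:
    """Hypothetical agent memory that tracks provenance per stored fact."""
    def __init__(self):
        self._facts = {}  # key -> (value, tainted)

    def store(self, key: str, value: str, *, untrusted_source: bool) -> None:
        self._facts[key] = (value, untrusted_source)

    def is_tainted(self, key: str) -> bool:
        return self._facts[key][1]

# Illustrative set of actions requiring elevated privilege.
PRIVILEGED = {"send_sms", "transfer_funds"}

def authorize(action: str, arg_keys: list, memory: TaintAwareMemory) -> bool:
    """Deny privileged actions whose arguments flow from tainted memory."""
    if action in PRIVILEGED and any(memory.is_tainted(k) for k in arg_keys):
        return False
    return True

mem = TaintAwareMemory()
mem.store("user_number", "+1-555-0100", untrusted_source=False)
# A phone number scraped from a webpage, i.e. a potential injection vector:
mem.store("webpage_number", "+1-555-6666", untrusted_source=True)

assert authorize("send_sms", ["user_number"], mem) is True
assert authorize("send_sms", ["webpage_number"], mem) is False  # injection blocked
```

This is the same intuition behind the paper's indirect-prompt-injection defense: an instruction embedded in external content can still enter the agent's context, but it cannot silently drive a privileged action.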

πŸ’‘ Why This Paper Matters

This paper argues that the dominant "Screen-as-Interface" paradigm for mobile agents is structurally insecure, and supports that claim with a systematic threat analysis of a deployed, state-of-the-art assistant. Its clean-slate alternative, Aura, demonstrates that an agent-native OS design built on structured interfaces, kernel-mediated communication, and cryptographic identity can improve task success, attack resistance, and latency at the same time. As autonomous agents move from app-level tools to system-level actors on phones, this kind of architectural groundwork matters.

🎯 Why It's Interesting for AI Security Researchers

This paper is of direct interest to AI security researchers because mobile agents that act on unstructured screen content are exposed to fake App identities, visual spoofing, indirect prompt injection, and unauthorized privilege escalation, all analyzed here against a real assistant. It offers both a reusable four-dimensional threat taxonomy (Agent Identity, External Interface, Internal Reasoning, Action Execution) and a concrete defense architecture, together with quantitative MobileSafetyBench results for judging the resulting safety-utility trade-off.

πŸ“š Read the Full Paper