
Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution

Authors: Yechao Zhang, Shiqian Zhao, Jie Zhang, Gelei Deng, Jiawen Zhang, Xiaogeng Liu, Chaowei Xiao, Tianwei Zhang

Published: 2026-03-24

arXiv ID: 2603.23064v3

Added to Library: 2026-04-07 02:02 UTC

📄 Abstract

We identify a critical security vulnerability in mainstream Claw personal AI agents: untrusted content encountered during heartbeat-driven background execution can silently pollute agent memory and subsequently influence user-facing behavior without the user's awareness. This vulnerability arises from an architectural design shared across the Claw ecosystem: heartbeat background execution runs in the same session as user-facing conversation, so content ingested from any external source monitored in the background (including email, message channels, news feeds, code repositories, and social platforms) can enter the same memory context used for foreground interaction, often with limited user visibility and without clear source provenance. We formalize this process as an Exposure (E) → Memory (M) → Behavior (B) pathway: misinformation encountered during heartbeat execution enters the agent's short-term session context, potentially gets written into long-term memory, and later shapes downstream user-facing behavior. We instantiate this pathway in an agent-native social setting using MissClaw, a controlled research replica of Moltbook. We find that (1) social credibility cues, especially perceived consensus, are the dominant driver of short-term behavioral influence, with misleading rates up to 61%; (2) routine memory-saving behavior can promote short-term pollution into durable long-term memory at rates up to 91%, with cross-session behavioral influence reaching 76%; (3) under naturalistic browsing with content dilution and context pruning, pollution still crosses session boundaries. Overall, prompt injection is not required: ordinary social misinformation is sufficient to silently shape agent memory and behavior under heartbeat-driven background execution.
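The Exposure → Memory → Behavior pathway described above can be illustrated with a toy simulation. This is a minimal sketch, not the paper's actual implementation: the `Agent` class, its methods, and the example content are all hypothetical, and the point is only to show how a shared session context plus routine memory-saving lets background-ingested content reach foreground behavior.

```python
class Agent:
    """Toy model of a heartbeat-driven agent (hypothetical; for illustration only)."""

    def __init__(self):
        self.session_context = []   # short-term memory, shared by both loops
        self.long_term_memory = []  # persists across sessions

    def heartbeat_tick(self, feed_items):
        """Background execution (Exposure): ingest monitored external content.
        Items land in the SAME session context the user-facing loop reads,
        with no provenance tag distinguishing them from trusted input."""
        self.session_context.extend(feed_items)

    def save_memory(self):
        """Routine memory-saving (Memory): promotes session content,
        including any pollution, into durable long-term memory."""
        self.long_term_memory.extend(self.session_context)

    def answer(self, question):
        """Foreground behavior (Behavior): a naive retrieval that treats
        every remembered note as equally trustworthy."""
        words = question.lower().split()
        for note in reversed(self.long_term_memory + self.session_context):
            if any(w in note.lower() for w in words):
                return note
        return "no relevant memory"


agent = Agent()
# Misinformation arrives via background monitoring, not prompt injection.
agent.heartbeat_tick(["Many accounts agree: LibFoo v2 is deprecated."])
agent.save_memory()
agent.session_context.clear()  # new session starts; pollution persists
print(agent.answer("Is LibFoo v2 deprecated?"))
# The polluted note is served back to the user as if it were trusted memory.
```

Because nothing in this loop records *where* a note came from, the foreground `answer` step cannot distinguish user-provided facts from content scraped during a heartbeat tick, which is the architectural gap the paper formalizes.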

🔍 Key Points

  • Identification of a critical architectural vulnerability shared across the Claw ecosystem: heartbeat-driven background execution runs in the same session as user-facing conversation, so untrusted content from monitored sources (email, message channels, news feeds, code repositories, social platforms) enters the shared memory context with limited visibility and no clear source provenance.
  • Formalization of the pollution process as an Exposure (E) → Memory (M) → Behavior (B) pathway: misinformation enters the agent's short-term session context, can be written into long-term memory, and later shapes downstream user-facing behavior.
  • Instantiation of the pathway in an agent-native social setting using MissClaw, a controlled research replica of Moltbook.
  • Empirical evidence that social credibility cues, especially perceived consensus, are the dominant driver of short-term behavioral influence (misleading rates up to 61%), and that routine memory-saving promotes short-term pollution into durable long-term memory at rates up to 91%, with cross-session behavioral influence reaching 76%.
  • Demonstration that prompt injection is not required: under naturalistic browsing with content dilution and context pruning, ordinary social misinformation still crosses session boundaries.

💡 Why This Paper Matters

The paper exposes a structural risk in how mainstream personal AI agents are built: because heartbeat background execution shares a session with user-facing conversation, ordinary misinformation encountered during routine background monitoring can silently pollute agent memory and steer later behavior, with no prompt injection required. By formalizing this as the E → M → B pathway and measuring it in a controlled social environment, the work broadens the agent-safety threat model beyond adversarial inputs and motivates designs that attach source provenance and separate background ingestion from user-facing memory.

🎯 Why It's Interesting for AI Security Researchers

This paper is highly relevant to AI security researchers because it demonstrates a memory-pollution attack surface that exists without any adversarial prompt engineering: routine background monitoring plus routine memory-saving is sufficient to make misinformation durable and cross-session. The MissClaw testbed provides a controlled environment for studying social credibility cues, notably perceived consensus, as influence vectors, and the reported rates (misleading rates up to 61%, promotion to long-term memory up to 91%, cross-session influence up to 76%) offer concrete baselines for evaluating defenses such as provenance tagging, memory auditing, and isolation between background and foreground contexts.
