OpenClaw PRISM: A Zero-Fork, Defense-in-Depth Runtime Security Layer for Tool-Augmented LLM Agents

📄 Abstract

Tool-augmented LLM agents introduce security risks that extend beyond user-input filtering, including indirect prompt injection through fetched content, unsafe tool execution, credential leakage, and tampering with local control files. We present OpenClaw PRISM, a zero-fork runtime security layer for OpenClaw-based agent gateways. PRISM combines an in-process plugin with optional sidecar services and distributes enforcement across ten lifecycle hooks spanning message ingress, prompt construction, tool execution, tool-result persistence, outbound messaging, sub-agent spawning, and gateway startup. Rather than introducing a novel detection model, PRISM integrates a hybrid heuristic-plus-LLM scanning pipeline, conversation- and session-scoped risk accumulation with TTL-based decay, policy-enforced controls over tools, paths, private networks, domain tiers, and outbound secret patterns, and a tamper-evident audit and operations plane with integrity verification and hot-reloadable policy management. We outline an evaluation methodology and benchmark pipeline for measuring security effectiveness, false positives, layer contribution, runtime overhead, and operational recoverability in an agent-runtime setting, and we report current preliminary benchmark results on curated same-slice experiments and operational microbenchmarks. The system targets deployable runtime defense for real agent gateways rather than benchmark-only detection.
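To make "session-scoped risk accumulation with TTL-based decay" concrete, here is a minimal Python sketch. It is not taken from PRISM: the class name `RiskAccumulator`, the `decide` helper, the thresholds, and the choice of hard expiry (each finding counts fully until its TTL passes, then drops out) are all assumptions made for illustration. The paper may use a different decay rule.

```python
import time
from collections import defaultdict


class RiskAccumulator:
    """Hypothetical per-session risk tracker; not PRISM's actual API."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so the decay can be tested without sleeping
        self._events = defaultdict(list)  # session_id -> [(expires_at, score)]

    def add(self, session_id, score):
        """Record one finding; it contributes until its TTL elapses."""
        expires_at = self.clock() + self.ttl_seconds
        self._events[session_id].append((expires_at, score))

    def total(self, session_id):
        """Drop expired findings and return the sum of the live ones."""
        now = self.clock()
        live = [(exp, s) for exp, s in self._events[session_id] if exp > now]
        self._events[session_id] = live
        return sum(s for _, s in live)


def decide(total, warn_at=0.5, block_at=1.0):
    """Map an accumulated score to an action (thresholds are assumed values)."""
    if total >= block_at:
        return "block"
    if total >= warn_at:
        return "warn"
    return "allow"
```

The point of accumulating per session is that several individually weak signals can add up to a block, while the TTL keeps an old finding from penalizing a session indefinitely.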

🔍 Key Points

OpenClaw PRISM is a zero-fork runtime security layer for OpenClaw-based agent gateways. It targets risks that input filtering alone does not cover: indirect prompt injection through fetched content, unsafe tool execution, credential leakage, and tampering with local control files.
Enforcement is distributed across ten lifecycle hooks spanning message ingress, prompt construction, tool execution, tool-result persistence, outbound messaging, sub-agent spawning, and gateway startup.
A hybrid heuristic-plus-LLM scanning pipeline assesses risk and escalates suspicious findings, with the aim of limiting computational overhead. The authors integrate existing techniques rather than propose a novel detection model.
A tamper-evident audit and operations plane combines integrity verification with hot-reloadable policy management, so operators can review events and adjust policy at runtime.
An evaluation methodology and benchmark pipeline measure security effectiveness, false positives, layer contribution, runtime overhead, and operational recoverability. The reported results are preliminary and come from curated same-slice experiments and operational microbenchmarks.
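One common way to make an audit log tamper-evident is a hash chain, in which every entry commits to the hash of the entry before it. The Python sketch below illustrates that general technique only. Whether PRISM uses a hash chain is not stated in the summary, and the function names, the record fields, and the hook names in the usage example are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash, record):
    """SHA-256 over the previous hash plus a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, record):
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"record": record, "prev": prev_hash, "hash": _digest(prev_hash, record)}
    log.append(entry)
    return entry


def verify(log):
    """Return the index of the first inconsistent entry, or None if the chain is intact."""
    prev_hash = GENESIS
    for i, entry in enumerate(log):
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["record"]):
            return i
        prev_hash = entry["hash"]
    return None


# Usage with hypothetical hook names and decisions.
log = []
append_entry(log, {"hook": "tool_execution", "decision": "block"})
append_entry(log, {"hook": "outbound_message", "decision": "allow"})
print(verify(log))  # None while the log is untouched
```

A bare hash chain only detects edits made by someone who does not recompute the later hashes. Detecting a full rewrite would need something extra, such as a keyed MAC or a chain head stored outside the log.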

💡 Why This Paper Matters

The paper offers a modular, deployable runtime defense that layers several controls against the security risks of tool-using agents. It covers tool misuse and credential leakage as well as prompt injection, and it targets real agent gateways rather than benchmark-only detection. Because the reported benchmark results are preliminary, the size of the practical safety gain is not yet established.

🎯 Why It's Interesting for AI Security Researchers

For AI security researchers, the paper is a worked example of runtime enforcement distributed across agent lifecycle events, in contrast to defenses that filter only at the input boundary. Its hook-level design and its evaluation methodology give a reference point for comparing future agent-runtime security architectures.
