← Back to Library

SHIELD: An Auto-Healing Agentic Defense Framework for LLM Resource Exhaustion Attacks

Authors: Nirhoshan Sivaroopan, Kanchana Thilakarathna, Albert Zomaya, Manu, Yi Guo, Jo Plested, Tim Lynar, Jack Yang, Wangli Yang

Published: 2026-01-27

arXiv ID: 2601.19174v1

Added to Library: 2026-01-28 03:01 UTC

Safety

πŸ“„ Abstract

Sponge attacks increasingly threaten LLM systems by inducing excessive computation and DoS. Existing defenses either rely on statistical filters that fail on semantically meaningful attacks or use static LLM-based detectors that struggle to adapt as attack strategies evolve. We introduce SHIELD, a multi-agent, auto-healing defense framework centered on a three-stage Defense Agent that integrates semantic similarity retrieval, pattern matching, and LLM-based reasoning. Two auxiliary agents, a Knowledge Updating Agent and a Prompt Optimization Agent, form a closed self-healing loop, when an attack bypasses detection, the system updates an evolving knowledgebase, and refines defense instructions. Extensive experiments show that SHIELD consistently outperforms perplexity-based and standalone LLM defenses, achieving high F1 scores across both non-semantic and semantic sponge attacks, demonstrating the effectiveness of agentic self-healing against evolving resource-exhaustion threats.

πŸ” Key Points

  • Introduction of SHIELD, the first auto-healing defense framework specifically designed to combat sponge attacks targeting LLMs, demonstrating robust adaptability to evolving attack strategies.
  • Development of a three-stage Defense Agent pipeline that focuses on semantic similarity filtering, substring matching, and LLM-based reasoning for prompt classification, enhancing detection accuracy across both non-semantic and semantic attacks.
  • Implementation of two auxiliary agentsβ€”a Knowledge Updating Agent (KUA) for evolving knowledgebase updates in response to new attacks, and a Prompt Optimization Agent (POA) for refining defensive prompts based on real-time attack information, enabling continuous learning without the need for model retraining.
  • Extensive experimental results showing that SHIELD significantly outperforms existing defense mechanisms, achieving high F1 scores and demonstrating effective mitigation of denial-of-service effects in high-throughput environments.
  • Identification of key design challenges such as the trade-off between detection coverage and user experience, knowledgebase growth management, and the robustness of the defense model itself, setting the stage for future research directions.

πŸ’‘ Why This Paper Matters

SHIELD represents a significant advancement in the defense of large language models against sophisticated resource-exhaustion attacks. It addresses critical gaps in existing defenses by offering a dynamically adapting, multi-agent framework that not only detects but also evolves with emerging threats. The empirical results affirm SHIELD's practical implications for maintaining the integrity and efficiency of LLM operations in real-world applications, making it invaluable for deployment in mission-critical environments where reliability is paramount.

🎯 Why It's Interesting for AI Security Researchers

This paper is of great interest to AI security researchers as it tackles emerging vulnerabilities in large language models, which are increasingly deployed in sensitive applications. The innovative concepts of auto-healing defenses and continuous adaptation in SHIELD provide a new paradigm for developing resilient AI systems. Furthermore, the challenges and findings discussed in the paper inform ongoing efforts to enhance model robustness and security, making it a critical read for researchers focused on safeguarding AI applications against evolving adversarial tactics.

πŸ“š Read the Full Paper