SoK: The Attack Surface of Agentic AI -- Tools, and Autonomy

Authors: Ali Dehghantanha, Sajad Homayoun

Published: 2026-03-24

arXiv ID: 2603.22928v1

Added to Library: 2026-03-25 02:00 UTC

Red Teaming

📄 Abstract

Recent AI systems combine large language models with tools, external knowledge via retrieval-augmented generation (RAG), and even autonomous multi-agent decision loops. This agentic AI paradigm greatly expands capabilities, but it also vastly enlarges the attack surface. In this systematization, we map out the trust boundaries and security risks of agentic LLM-based systems. We develop a comprehensive taxonomy of attacks spanning prompt-level injections, knowledge-base poisoning, tool/plug-in exploits, and multi-agent emergent threats. Through a detailed literature review, we synthesize evidence from 2023-2025, including more than 20 peer-reviewed and archival studies, industry reports, and standards. We find that agentic systems introduce new vectors for indirect prompt injection, code execution exploits, RAG index poisoning, and cross-agent manipulation that go beyond traditional AI threats. We define attacker models and threat scenarios, and propose metrics (e.g., Unsafe Action Rate, Privilege Escalation Distance) to evaluate security posture. Our survey examines defenses such as input sanitization, retrieval filters, sandboxes, access control, and "AI guardrails," assessing their effectiveness and pointing out the areas where protection is still lacking. To assist practitioners, we outline defensive controls and provide a phased security checklist for deploying agentic AI (covering design-time hardening, runtime monitoring, and incident response). Finally, we outline open research challenges in secure autonomous AI (robust tool APIs, verifiable agent behavior, supply-chain safeguards) and discuss ethical and responsible disclosure practices. We systematize recent findings to help researchers and engineers understand and mitigate security risks in agentic AI.
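To make the "retrieval filter" defense mentioned in the abstract concrete, here is a minimal sketch of screening RAG passages for injection phrases before they enter an LLM's context. The patterns, function name, and quarantine behavior are illustrative assumptions, not the paper's implementation; production systems typically combine such heuristics with ML classifiers and provenance checks.

```python
import re

# Hypothetical injection markers for illustration only; real deployments
# would use learned detectors and document-provenance signals instead.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"reveal (the )?system prompt", re.I),
]

def filter_retrieved(docs):
    """Split retrieved passages into clean and quarantined lists.

    A passage matching any known injection phrase is quarantined rather
    than passed into the agent's context window (naive keyword filter).
    """
    clean, quarantined = [], []
    for doc in docs:
        if any(p.search(doc) for p in INJECTION_PATTERNS):
            quarantined.append(doc)
        else:
            clean.append(doc)
    return clean, quarantined
```

A caller would run `filter_retrieved` on the top-k retrieval results and log the quarantined passages for review, trading some recall for protection against indirect prompt injection via the knowledge base.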

🔍 Key Points

  • Develops a comprehensive taxonomy of attacks specific to agentic AI systems, identifying attack goals and vectors related to prompt injection, RAG, tool misuse, and multi-agent threats.
  • Introduces an attacker model and threat scenarios, along with metrics such as Unsafe Action Rate (UAR) and Privilege Escalation Distance (PED), to evaluate security posture.
  • Conducts a detailed literature synthesis from 2023-2025, assessing over 20 studies and summarizing the effectiveness of current defenses such as input sanitization, retrieval filters, and sandboxing.
  • Proposes a phased security checklist for deploying agentic AI, focusing on design-time hardening, runtime monitoring, and incident response strategies.
  • Identifies critical open research challenges in securing autonomous AI systems, including formal methods for agent behavior verification and robust tool API management.
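The summary names the UAR and PED metrics without formulas, so the sketch below encodes one plausible reading: UAR as the fraction of evaluated agent actions flagged unsafe, and PED as the shortest path from an attacker-reachable capability to a privileged one in a capability graph. All names and the graph encoding are assumptions for illustration, not the paper's formal definitions.

```python
from collections import deque

def unsafe_action_rate(flags):
    """UAR sketch: fraction of actions flagged unsafe in an evaluation run.

    `flags` is a list of booleans, True meaning an evaluator judged the
    corresponding agent action unsafe (hypothetical encoding).
    """
    if not flags:
        return 0.0
    return sum(flags) / len(flags)

def privilege_escalation_distance(graph, start, privileged):
    """PED sketch: BFS hop count from an attacker-reachable node to the
    nearest privileged capability; None if no path exists.

    `graph` maps capability -> list of reachable capabilities
    (illustrative interpretation, not the paper's formal definition).
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node in privileged:
            return dist
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None
```

Under this reading, a lower UAR and a higher PED both indicate a stronger security posture: fewer unsafe actions slip through, and an attacker needs more pivots to reach privileged tools.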

💡 Why This Paper Matters

This paper presents a critical exploration of the attack surface associated with agentic AI systems, highlighting the unique vulnerabilities that arise from their complexity and autonomy. By systematizing the security risks and providing a framework for understanding these threats, this work significantly contributes to the field of AI security, offering guidance for both researchers and practitioners.

🎯 Why It's Interesting for AI Security Researchers

The findings and methodologies outlined in this paper are crucial for AI security researchers because they address the emerging threats posed by advanced AI systems. As AI is increasingly integrated into production applications, understanding these multifaceted attack surfaces and developing effective defenses becomes essential. Researchers can leverage the insights provided to strengthen the security frameworks of AI systems and to guide future studies in this rapidly evolving field.

📚 Read the Full Paper