WebWeaver: Breaking Topology Confidentiality in LLM Multi-Agent Systems with Stealthy Context-Based Inference

Authors: Zixun Xiong, Gaoyi Wu, Lingfeng Yao, Miao Pan, Xiaojiang Du, Hao Wang

Published: 2026-03-11

arXiv ID: 2603.11132v1

Added to Library: 2026-03-13 03:04 UTC

Red Teaming

📄 Abstract

Communication topology is a critical factor in the utility and safety of LLM-based multi-agent systems (LLM-MAS), making it a high-value intellectual property (IP) whose confidentiality remains insufficiently studied. Existing topology inference attempts rely on impractical assumptions, including control over the administrative agent and direct identity queries via jailbreaks, which are easily defeated by basic keyword-based defenses. As a result, prior analyses fail to capture the real-world threat of such attacks. To bridge this realism gap, we propose WebWeaver, an attack framework that infers the complete LLM-MAS topology by compromising only a single arbitrary agent instead of the administrative agent. Unlike prior approaches, WebWeaver relies solely on agent contexts rather than agent IDs, enabling significantly stealthier inference. WebWeaver further introduces a new covert jailbreak-based mechanism and a novel fully jailbreak-free diffusion design to handle cases where jailbreaks fail. Additionally, we address a key challenge in diffusion-based inference by proposing a masking strategy that preserves known topology during diffusion, with theoretical guarantees of correctness. Extensive experiments show that WebWeaver substantially outperforms state-of-the-art (SOTA) baselines, achieving about 60% higher inference accuracy under active defenses with negligible overhead.

🔍 Key Points

  • The WebWeaver framework infers the complete LLM-MAS topology through a single compromised agent, improving the realism of attack modeling.
  • Introduced covert recursive jailbreak mechanisms and a fully jailbreak-free diffusion design to ensure topology inference even when traditional methods fail.
  • Implemented a masking strategy in the diffusion module to preserve known topology during inference, which provides theoretical correctness guarantees.
  • Extensive experiments show that WebWeaver outperforms existing state-of-the-art (SOTA) methods, achieving roughly 60% higher inference accuracy under active defense measures.
  • Developed a new dialogue dataset with annotated topology and agent prompts to support both current and future research in multi-agent systems security.
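The masking strategy in the diffusion module can be illustrated with a toy sketch. This is not the paper's actual update rule; the smoothing step, the `alpha` parameter, and the function name `diffuse_with_mask` are all illustrative assumptions. The point being sketched is the masking itself: adjacency entries already confirmed are re-clamped after every diffusion step, so smoothing of unknown entries can never overwrite known topology.

```python
import numpy as np

def diffuse_with_mask(adj_est, known_mask, known_vals, steps=50, alpha=0.5):
    """Toy diffusion over an adjacency estimate.

    adj_est    : (n, n) array of edge-probability estimates in [0, 1]
    known_mask : (n, n) boolean array marking entries whose value is
                 already confirmed
    known_vals : (n, n) array holding the confirmed values

    Unknown entries are iteratively smoothed (a stand-in for the real
    diffusion update); known entries are restored to their confirmed
    values after every step, preserving known topology throughout.
    """
    A = adj_est.copy()
    n = A.shape[0]
    for _ in range(steps):
        # placeholder smoothing step, not the paper's update rule
        A = (1 - alpha) * A + alpha * (A @ A.T) / n
        A = np.clip(A, 0.0, 1.0)
        # masking step: clamp confirmed entries back to their values
        A[known_mask] = known_vals[known_mask]
    return A
```

Whatever the concrete update rule is, the clamp at the end of each iteration is what gives the "known topology is preserved during diffusion" property: confirmed entries are fixed points of every step by construction.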

💡 Why This Paper Matters

The WebWeaver framework represents a significant advancement in the security of large language model-based multi-agent systems by addressing the critical issue of topology confidentiality. By providing a practical and effective method to infer such topologies, the paper highlights vulnerabilities in existing assumptions surrounding multi-agent system security, thus paving the way for more robust defensive strategies. The empirical results demonstrate real-world applicability and suggest a shift in how security assessments should be approached in collaborative LLM settings.

🎯 Why It's Interesting for AI Security Researchers

This paper would be of great interest to AI security researchers as it uncovers significant vulnerabilities in LLM-MAS regarding topology confidentiality, an often-overlooked aspect of AI system security. The novel methodologies presented—particularly the use of context-based inferences and adaptive jailbreak mechanisms—challenge existing defenses and prompt the need for enhanced protective measures. Additionally, the findings highlight the risks associated with real-world applications of LLMs in collaborative environments, making it a critical read for anyone focused on ensuring AI system integrity.

📚 Read the Full Paper