
Adversarial Attacks and Defenses on Graph-aware Large Language Models (LLMs)

Authors: Iyiola E. Olatunji, Franziska Boenisch, Jing Xu, Adam Dziedzic

Published: 2025-08-06

arXiv ID: 2508.04894v1

Added to Library: 2025-08-14 23:05 UTC

Safety

📄 Abstract

Large Language Models (LLMs) are increasingly integrated with graph-structured data for tasks like node classification, a domain traditionally dominated by Graph Neural Networks (GNNs). While this integration leverages rich relational information to improve task performance, the robustness of the resulting graph-aware LLMs against adversarial attacks remains unexplored. We take the first step to explore the vulnerabilities of graph-aware LLMs by leveraging existing adversarial attack methods tailored for graph-based models, including poisoning (training-time) and evasion (test-time) attacks, on two representative models, LLaGA (Chen et al. 2024) and GraphPrompter (Liu et al. 2024). Additionally, we discover a new attack surface for LLaGA where an attacker can inject malicious nodes as placeholders into the node sequence template to severely degrade its performance. Our systematic analysis reveals that certain design choices in graph encoding can enhance attack success, with specific findings that: (1) the node sequence template in LLaGA increases its vulnerability; (2) the GNN encoder used in GraphPrompter demonstrates greater robustness; and (3) both approaches remain susceptible to imperceptible feature perturbation attacks. Finally, we propose GaLGuard, an end-to-end defense framework that combines an LLM-based feature correction module to mitigate feature-level perturbations with adapted GNN defenses to protect against structural attacks.
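
As a rough intuition for the feature-level threat model mentioned in the abstract, the sketch below shows a generic PGD-style evasion attack that perturbs node features within a small L-infinity budget. The victim interface `model(features, edge_index)`, the budget `epsilon`, and the step schedule are illustrative assumptions and not the attack configuration evaluated in the paper.

```python
# Minimal sketch (not the paper's attack code): PGD-style feature perturbation
# against a differentiable graph-aware node classifier. The `model` interface,
# epsilon, alpha, and step count are illustrative assumptions.
import torch
import torch.nn.functional as F

def feature_pgd_attack(model, features, edge_index, labels, target_nodes,
                       epsilon=0.01, alpha=0.002, steps=20):
    """Perturb node features within an L-inf ball to degrade predictions on target_nodes."""
    perturbed = features.clone().detach()
    for _ in range(steps):
        perturbed.requires_grad_(True)
        logits = model(perturbed, edge_index)                      # victim forward pass
        loss = F.cross_entropy(logits[target_nodes], labels[target_nodes])
        grad = torch.autograd.grad(loss, perturbed)[0]
        with torch.no_grad():
            # gradient ascent on the loss, then project back into the budget
            perturbed = perturbed + alpha * grad.sign()
            perturbed = features + (perturbed - features).clamp(-epsilon, epsilon)
        perturbed = perturbed.detach()
    return perturbed
```

Because the per-feature budget is small relative to the feature scale, such perturbations are hard to spot by inspection, which is the sense in which the abstract calls the feature attacks "imperceptible".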

🔍 Key Points

  • The paper uncovers substantial vulnerabilities of graph-aware LLMs to adversarial attacks, identifying new attack surfaces such as malicious node injection into LLaGA's node sequence template (a toy illustration of this idea follows the list below).
  • A taxonomy of graph encoding methods for graph-aware LLMs and GNNs is presented, categorizing approaches by whether graphs are described in text or encoded through learned projectors.
  • Feature perturbation attacks are shown to be particularly effective against graph-aware LLMs, often outperforming the structural attacks traditionally used against GNNs.
  • The paper introduces GaLGuard, an end-to-end defense framework that combines an LLM-based feature correction module with adapted GNN defenses to counter both feature-level and structural attacks.
  • Experimental results underline the importance of the graph-aware paradigm: LLaGA is more susceptible to attacks than GraphPrompter, largely because of its node sequence template, whereas GraphPrompter's GNN encoder confers greater robustness.
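
The node-injection finding in the first bullet is easiest to picture with a toy prompt-construction sketch. Everything here is an illustrative assumption: the serialization format, the neighbor list, and the `build_prompt` / `inject_malicious_placeholders` helpers are invented and do not reproduce LLaGA's actual node sequence template, projector, or sampling strategy.

```python
# Toy illustration (hypothetical) of the node-injection attack surface on a
# node sequence template. Not LLaGA's real encoding.
from typing import Dict, List

def build_prompt(center: int, neighbors: List[int], node_text: Dict[int, str]) -> str:
    """Serialize a center node and its sampled neighbors into an LLM prompt."""
    sequence = [center] + neighbors
    lines = [f"Node {n}: {node_text.get(n, '<placeholder>')}" for n in sequence]
    return "Node sequence:\n" + "\n".join(lines) + "\nTask: classify the center node."

def inject_malicious_placeholders(neighbors: List[int], fake_nodes: List[int],
                                  budget: int = 2) -> List[int]:
    """Attacker-controlled fake nodes displace real neighbors in the sequence."""
    keep = max(0, len(neighbors) - budget)
    return fake_nodes[:budget] + neighbors[:keep]

# Example: two injected nodes (900, 901) push out two genuine neighbors, so
# attacker-chosen text dominates the serialized context the LLM conditions on.
node_text = {0: "paper on GNN robustness", 7: "survey of LLMs", 12: "graph attack benchmark",
             900: "misleading attacker text", 901: "misleading attacker text"}
clean = build_prompt(0, [7, 12], node_text)
attacked = build_prompt(0, inject_malicious_placeholders([7, 12], [900, 901]), node_text)
```

The point of the sketch is only that whoever controls which nodes land in the serialized sequence also controls a large share of the context the LLM reasons over, which is why the template becomes an attack surface.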

💡 Why This Paper Matters

This paper is crucial as it opens a new avenue of research into the vulnerabilities of graph-aware LLMs, providing essential insights for improving their adversarial robustness. By systematically exploring the integration of graph structures into LLMs and the implications of various encoding schemes, it sets a foundation for future work aimed at enhancing the security and stability of these advanced models in real-world applications.

🎯 Why It's Interesting for AI Security Researchers

AI security researchers will find this paper particularly relevant as it delves into the emerging intersection of natural language processing and graph-based models, highlighting specific vulnerabilities and attack methods unique to this paradigm. It also provides a robust defense strategy, offering practical implications for developing more resilient AI systems against adversarial threats, which is a growing concern in the field.

📚 Read the Full Paper

https://arxiv.org/abs/2508.04894v1