
FraudShield: Knowledge Graph Empowered Defense for LLMs against Fraud Attacks

Authors: Naen Xu, Jinghuai Zhang, Ping He, Chunyi Zhou, Jun Wang, Zhihui Fu, Tianyu Du, Zhaoxiang Wang, Shouling Ji

Published: 2026-01-30

arXiv ID: 2601.22485v1

Added to Library: 2026-02-03 08:10 UTC

Safety

📄 Abstract

Large language models (LLMs) have been widely integrated into critical automated workflows, including contract review and job application processes. However, LLMs are susceptible to manipulation by fraudulent information, which can lead to harmful outcomes. Although advanced defense methods have been developed to address this issue, they often exhibit limitations in effectiveness, interpretability, and generalizability, particularly when applied to LLM-based applications. To address these challenges, we introduce FraudShield, a novel framework designed to protect LLMs from fraudulent content by leveraging a comprehensive analysis of fraud tactics. Specifically, FraudShield constructs and refines a fraud tactic-keyword knowledge graph to capture high-confidence associations between suspicious text and fraud techniques. The structured knowledge graph augments the original input by highlighting keywords and providing supporting evidence, guiding the LLM toward more secure responses. Extensive experiments show that FraudShield consistently outperforms state-of-the-art defenses across four mainstream LLMs and five representative fraud types, while also offering interpretable clues for the model's generations.

🔍 Key Points

  • Introduction of FraudShield, a framework that uses a fraud tactic–keyword knowledge graph to identify and counter fraud tactics targeting LLMs.
  • FraudShield employs a two-stage process: it first detects keywords associated with fraudulent tactics, then supplies interpretive context alongside the input to guide LLM responses.
  • Extensive evaluations demonstrate that FraudShield significantly outperforms existing defenses across multiple LLMs and fraud types, showing improvements in detection rates.
  • The framework enhances human users' understanding of fraudulent materials through explicit highlighting and evidence-based reasoning.
  • A user study indicates that FraudShield not only bolsters LLM defenses but also increases user awareness and alertness towards fraud.
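The two-stage flow described above (keyword detection against a tactic–keyword knowledge graph, then input augmentation with highlighted keywords and supporting evidence) can be sketched in miniature. This is a hypothetical illustration, not the paper's implementation: the graph entries, function names, and prompt format are all assumptions.

```python
import re

# Toy fraud tactic-keyword knowledge graph (illustrative entries only;
# the paper constructs and refines its graph from real fraud analyses).
KNOWLEDGE_GRAPH = {
    "urgency pressure": ["act now", "limited time", "immediately"],
    "authority impersonation": ["official notice", "bank security team"],
    "payment redirection": ["wire transfer", "gift card", "crypto wallet"],
}

def detect_tactics(text):
    """Stage 1: match the input against graph keywords and return
    (tactic, keyword) evidence pairs."""
    lowered = text.lower()
    evidence = []
    for tactic, keywords in KNOWLEDGE_GRAPH.items():
        for kw in keywords:
            if kw in lowered:
                evidence.append((tactic, kw))
    return evidence

def augment_prompt(text, evidence):
    """Stage 2: highlight suspicious keywords in the input and append
    the evidence so the LLM responds with the flagged tactics in view."""
    highlighted = text
    for _, kw in evidence:
        highlighted = re.sub(re.escape(kw), f"[SUSPICIOUS: {kw}]",
                             highlighted, flags=re.IGNORECASE)
    notes = "\n".join(
        f"- '{kw}' is associated with the fraud tactic: {tactic}"
        for tactic, kw in evidence
    )
    return (f"{highlighted}\n\nFraud analysis evidence:\n{notes}\n"
            "Consider these flags before acting on the request.")

if __name__ == "__main__":
    msg = "Official notice: act now and send a wire transfer today."
    print(augment_prompt(msg, detect_tactics(msg)))
```

The augmented prompt keeps the original text intact while making the evidence explicit, which is what gives the defense its interpretability for both the LLM and human reviewers.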

💡 Why This Paper Matters

The paper presents FraudShield, a framework that improves the resilience of large language models against fraud. By leveraging knowledge graphs and prioritizing interpretability, it advances AI security on both technical and societal fronts, and its practical approach matters for building trustworthy AI systems used in automated decision-making.

🎯 Why It's Interesting for AI Security Researchers

This paper is highly relevant for AI security researchers given its novel approach to enhancing the security of large language models against evolving fraud tactics. The use of knowledge graphs for fraud detection presents a new paradigm in securing AI applications, making it a pivotal study for those working on the intersection of AI, security, and interpretability.
