← Back to Library

Phishing Email Detection Using Large Language Models

Authors: Najmul Hasan, Prashanth BusiReddyGari, Haitao Zhao, Yihao Ren, Jinsheng Xu, Shaohu Zhang

Published: 2025-12-10

arXiv ID: 2512.10104v2

Added to Library: 2026-01-07 10:13 UTC

Red Teaming

📄 Abstract

Email phishing is one of the most prevalent and globally consequential vectors of cyber intrusion. As systems increasingly deploy Large Language Models (LLMs) applications, these systems face evolving phishing email threats that exploit their fundamental architectures. Current LLMs require substantial hardening before deployment in email security systems, particularly against coordinated multi-vector attacks that exploit architectural vulnerabilities. This paper proposes LLMPEA, an LLM-based framework to detect phishing email attacks across multiple attack vectors, including prompt injection, text refinement, and multilingual attacks. We evaluate three frontier LLMs (e.g., GPT-4o, Claude Sonnet 4, and Grok-3) and comprehensive prompting design to assess their feasibility, robustness, and limitations against phishing email attacks. Our empirical analysis reveals that LLMs can detect the phishing email over 90% accuracy while we also highlight that LLM-based phishing email detection systems could be exploited by adversarial attack, prompt injection, and multilingual attacks. Our findings provide critical insights for LLM-based phishing detection in real-world settings where attackers exploit multiple vulnerabilities in combination.

🔍 Key Points

  • Proposes LLM-PEA, a comprehensive framework to evaluate phishing detection robustness of LLMs against multiple attack vectors including prompt injection and multilingual attacks.
  • Demonstrates that frontier LLMs like GPT-4o, Claude Sonnet 4, and Grok-3 can achieve over 90% accuracy in detecting phishing emails, while exposing vulnerabilities to adversarial manipulation and prompt injections.
  • Evaluates the performance under realistic conditions with imbalanced datasets that reflect actual email traffic, revealing significant issues with model robustness in multilingual contexts.
  • Highlights the limitations of existing phishing detection methodologies that often overlook compound vulnerabilities arising from simultaneous attacks in LLM architectures.
  • Provides empirical findings that underline the necessity for hardening LLM-based phishing systems specifically to handle combined attacks and adversarial tactics.

💡 Why This Paper Matters

This paper is relevant as it addresses a critical issue in cybersecurity where phishing attacks increasingly exploit AI systems. By evaluating the robustness and vulnerability of state-of-the-art LLMs in detecting phishing emails, it contributes valuable insights for improving the security of these AI applications, emphasizing the need for enhanced defensive strategies.

🎯 Why It's Interesting for AI Security Researchers

This paper would greatly interest AI security researchers as it details technical vulnerabilities in LLMs surrounding phishing email detection, an area critical to cybersecurity. The unified approach outlined in LLM-PEA highlights the need for comprehensive evaluation methods that take into account various attack vectors, contributing to safer AI deployment in real-world applications.

📚 Read the Full Paper