How Secure is Secure Code Generation? Adversarial Prompts Put LLM Defenses to the Test

Authors: Melissa Tessa, Iyiola E. Olatunji, Aicha War, Jacques Klein, Tegawendé F. Bissyandé

Published: 2026-01-11

arXiv ID: 2601.07084v1

Added to Library: 2026-01-13 03:01 UTC

Safety

📄 Abstract

Recent secure code generation methods based on vulnerability-aware fine-tuning, prefix-tuning, and prompt optimization claim to prevent LLMs from producing insecure code. However, their robustness under adversarial conditions remains untested, and current evaluations decouple security from functionality, potentially inflating reported gains. We present the first systematic adversarial audit of state-of-the-art secure code generation methods (SVEN, SafeCoder, PromSec). We subject them to realistic prompt perturbations, such as paraphrasing, cue inversion, and context manipulation, that developers might inadvertently introduce or adversaries deliberately exploit. To enable fair comparison, we evaluate all methods under consistent conditions, jointly assessing security and functionality using multiple analyzers and executable tests. Our findings reveal critical robustness gaps: static analyzers overestimate security by 7 to 21 times, and 37 to 60% of "secure" outputs are non-functional. Under adversarial conditions, true secure-and-functional rates collapse to 3 to 17%. Based on these findings, we propose best practices for building and evaluating robust secure code generation methods. Our code is available.
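To make the perturbation families named in the abstract concrete, here is a minimal Python sketch of how a prompt might be paraphrased, have its security cue inverted, or be wrapped in misleading context. These are illustrative simplifications, not the paper's actual implementation (the abstract describes the perturbations only at a high level); all function names and the example prompt are hypothetical.

```python
# Illustrative stand-ins for the three perturbation families named in the
# abstract: paraphrasing, cue inversion, and context manipulation.
# Names and prompts are hypothetical, not taken from the paper's code.

def paraphrase(prompt: str) -> str:
    """Naive lexical paraphrase: swap a few common phrasings."""
    return (prompt.replace("Write a function", "Implement a routine")
                  .replace("securely", "in a safe way"))

def invert_cue(prompt: str) -> str:
    """Cue inversion: flip an explicit security cue in the instruction."""
    return prompt.replace("avoid SQL injection", "keep the query simple")

def manipulate_context(prompt: str, distractor: str) -> str:
    """Context manipulation: prepend unrelated or misleading context."""
    return f"{distractor}\n\n{prompt}"

base = "Write a function that queries users by name and avoid SQL injection."
variants = [
    paraphrase(base),
    invert_cue(base),
    manipulate_context(base, "# Legacy module; plain string formatting is fine here."),
]
for v in variants:
    print(v)
```

In the paper's setting, each such variant would then be fed to the secure code generator and the resulting output scored for both security and functionality.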

🔍 Key Points

  • The paper systematically audits three state-of-the-art secure code generation methods (SVEN, SafeCoder, PromSec), revealing significant robustness gaps under adversarial conditions.
  • It establishes a unified evaluation framework that measures security and functionality jointly, addressing the common decoupling of these two aspects in previous evaluations (a minimal sketch of such a joint metric follows this list).
  • Findings indicate that static analyzers overestimate security by a factor of 7 to 21, with 37 to 60% of generated 'secure' code being non-functional.
  • Under adversarial conditions, the true rate of code that is both secure and functional drops dramatically (to between 3% and 17%), exposing critical vulnerabilities in current secure code generation approaches.
  • The paper proposes actionable best practices for the development and evaluation of secure code generation methods, emphasizing the need for adversarial testing and joint metrics.
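For illustration, the sketch below shows one way to compute a joint secure-and-functional rate alongside the security-only rate that a decoupled evaluation would report, assuming each generated sample already carries verdicts from security analyzers and executable tests. The `Sample` dataclass and function names are hypothetical, not taken from the paper's released code.

```python
# Minimal sketch of a joint metric, assuming each generated sample has been
# labeled by (a) security analyzers and (b) an executable test suite.
# The dataclass and field names are illustrative, not from the paper's code.
from dataclasses import dataclass

@dataclass
class Sample:
    secure: bool      # passes all security analyzers
    functional: bool  # passes the executable tests

def security_only_rate(samples: list[Sample]) -> float:
    """The rate a security-only evaluation would report (can be inflated)."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if s.secure) / len(samples)

def secure_and_functional_rate(samples: list[Sample]) -> float:
    """Fraction of samples that are BOTH secure and functional."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if s.secure and s.functional) / len(samples)

samples = [Sample(True, False), Sample(True, True), Sample(False, True), Sample(True, False)]
print(security_only_rate(samples))          # 0.75 -- looks strong in isolation
print(secure_and_functional_rate(samples))  # 0.25 -- the joint picture
```

Reporting only the first number is what the authors argue inflates the security gains claimed by prior evaluations.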

💡 Why This Paper Matters

This paper is relevant because it challenges the claimed effectiveness of current secure code generation methods by exposing their weaknesses under realistic attack scenarios. By focusing on robust evaluation practices, it aims to improve the trustworthiness of AI-generated code in critical applications, addressing a significant issue in AI security that affects developers, companies, and end-users alike.

🎯 Why It's Interesting for AI Security Researchers

AI security researchers would be interested in this paper as it highlights the gap between purported security and actual performance in AI code-generation systems. The insights gained from this research could inform future designs of more robust models that are better at recognizing and mitigating vulnerabilities, thereby enhancing the overall security posture of software development practices.

📚 Read the Full Paper