
Measuring the Vulnerability Disclosure Policies of AI Vendors

Authors: Yangheran Piao, Jingjie Li, Daniel W. Woods

Published: 2025-09-07

arXiv ID: 2509.06136v1

Added to Library: 2025-09-09 04:00 UTC

📄 Abstract

As AI is increasingly integrated into products and critical systems, researchers are paying greater attention to identifying related vulnerabilities. Effective remediation depends on whether vendors are willing to accept and respond to AI vulnerability reports. In this paper, we examine the disclosure policies of 264 AI vendors. Using a mixed-methods approach, our quantitative analysis finds that 36% of vendors provide no disclosure channel, and only 18% explicitly mention AI-related risks. Vulnerabilities involving data access, authorization, and model extraction are generally considered in-scope, while jailbreaking and hallucination are frequently excluded. Through qualitative analysis, we further identify three vendor postures toward AI vulnerabilities: proactive clarification (n = 46, including active supporters, AI integrationists, and back channels), silence (n = 115, including self-hosted and hosted vendors), and restrictive (n = 103). Finally, by comparing vendor policies against 1,130 AI incidents and 359 academic publications, we show that bug bounty policy evolution has lagged behind both academic research and real-world events.

🔍 Key Points

  • Examines the vulnerability disclosure policies of 264 AI vendors using a mixed-methods (quantitative and qualitative) approach.
  • Quantitative analysis finds that 36% of vendors provide no disclosure channel at all, and only 18% explicitly mention AI-related risks in their policies.
  • Vulnerabilities involving data access, authorization, and model extraction are generally treated as in-scope, while jailbreaking and hallucination are frequently excluded.
  • Qualitative analysis identifies three vendor postures toward AI vulnerabilities: proactive clarification (n = 46, including active supporters, AI integrationists, and back channels), silence (n = 115, including self-hosted and hosted vendors), and restrictive (n = 103); see the sketch after this list for the corresponding shares.
  • Comparing vendor policies against 1,130 AI incidents and 359 academic publications shows that bug bounty policy evolution has lagged behind both academic research and real-world events.
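
The Python sketch below is not from the paper; it simply tallies the posture counts reported in the abstract (46, 115, 103) against the 264 surveyed vendors and converts the headline percentages (36% with no disclosure channel, 18% explicitly mentioning AI risks) into approximate vendor counts. All variable names and the rounding are illustrative assumptions.

```python
# Illustrative tally of the breakdown reported in the abstract.
# Counts and percentages come from the abstract; names and rounding
# are assumptions for illustration only.

TOTAL_VENDORS = 264

postures = {
    "proactive clarification": 46,  # active supporters, AI integrationists, back channels
    "silence": 115,                 # self-hosted and hosted vendors
    "restrictive": 103,
}

# The three postures partition the surveyed vendors.
assert sum(postures.values()) == TOTAL_VENDORS

for name, count in postures.items():
    print(f"{name}: {count} vendors ({count / TOTAL_VENDORS:.0%})")

# Headline percentages expressed as approximate vendor counts.
no_channel = round(0.36 * TOTAL_VENDORS)         # ~95 vendors with no disclosure channel
mentions_ai_risks = round(0.18 * TOTAL_VENDORS)  # ~48 vendors explicitly mentioning AI risks
print(f"No disclosure channel: ~{no_channel} of {TOTAL_VENDORS}")
print(f"Explicitly mention AI risks: ~{mentions_ai_risks} of {TOTAL_VENDORS}")
```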

💡 Why This Paper Matters

By measuring the disclosure policies of 264 AI vendors, the paper shows how far responsible disclosure practice lags behind the growing attention to AI vulnerabilities: more than a third of vendors offer no disclosure channel at all, and fewer than one in five explicitly address AI-related risks. Its scope analysis clarifies which classes of AI vulnerabilities (data access, authorization, model extraction) vendors generally accept and which (jailbreaking, hallucination) they frequently exclude, and its comparison against 1,130 AI incidents and 359 academic publications demonstrates that bug bounty policies have not kept pace with either research or real-world events.

🎯 Why It's Interesting for AI Security Researchers

For researchers who find AI vulnerabilities, effective remediation depends on whether vendors will accept and respond to their reports. This study maps the current disclosure landscape: which vendors provide a reporting channel, which AI-specific vulnerability classes fall in scope, and how vendor postures range from proactive clarification to silence and restriction. These findings help researchers decide where and how to report AI vulnerabilities, and the documented gap between vendor policies, academic research, and real-world incidents points to concrete areas where disclosure programs need to evolve.

📚 Read the Full Paper