Accelerating Suffix Jailbreak attacks with Prefix-Shared KV-cache

Authors: Xinhai Wang, Shaopeng Fu, Shu Yang, Liangyu Wang, Tianhang Zheng, Di Wang

Published: 2026-03-12

arXiv ID: 2603.13420v1

Added to Library: 2026-03-17 02:01 UTC

Red Teaming

📄 Abstract

Suffix jailbreak attacks serve as a systematic method for red-teaming Large Language Models (LLMs) but suffer from prohibitive computational costs, as a large number of candidate suffixes need to be evaluated before identifying a jailbreak suffix. This paper presents Prefix-Shared KV Cache (PSKV), a plug-and-play inference optimization technique tailored for jailbreak suffix generation. Our method is motivated by a key observation that when performing suffix jailbreaking, while a large number of candidate prompts need to be evaluated, they share the same targeted harmful instruction as the prefix. Therefore, instead of performing redundant inference on the duplicated prefix, PSKV maintains a single KV cache for this prefix and shares it with every candidate prompt, enabling the parallel inference of diverse suffixes with minimal memory overhead. This design enables more aggressive batching strategies that would otherwise be limited by memory constraints. Extensive experiments on six widely used suffix attacks across five widely deployed LLMs demonstrate that PSKV reduces inference time by 40% and peak memory usage by 50%, while maintaining the original Attack Success Rate (ASR). The code has been submitted and will be released publicly.
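The core idea can be illustrated with a toy single-head attention computation. The following sketch is not the paper's implementation (that code is not yet released); it is a minimal NumPy simulation, with made-up projection matrices and token embeddings, showing that computing the prefix's keys and values once and reusing them for every candidate suffix yields the same attention output as naively recomputing the full prompt each time.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy head dimension

# Stand-ins for a model's query/key/value projection weights.
Wq = rng.standard_normal((d, d))
Wk = rng.standard_normal((d, d))
Wv = rng.standard_normal((d, d))

def kv(x):
    """Project token embeddings x of shape (T, d) to keys and values."""
    return x @ Wk, x @ Wv

def attend(q, K, V):
    """Single-query softmax attention over a cache of keys/values."""
    scores = q @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

prefix = rng.standard_normal((6, d))                         # shared instruction prefix
suffixes = [rng.standard_normal((3, d)) for _ in range(4)]   # candidate suffixes

# PSKV idea: compute the prefix KV cache once, share it across candidates.
Kp, Vp = kv(prefix)
outs_shared = []
for s in suffixes:
    Ks, Vs = kv(s)
    K = np.vstack([Kp, Ks])  # shared prefix cache + per-suffix extension
    V = np.vstack([Vp, Vs])
    q = s[-1] @ Wq           # query from the last suffix token
    outs_shared.append(attend(q, K, V))

# Baseline: redundantly recompute the prefix KV for every candidate.
outs_naive = []
for s in suffixes:
    K, V = kv(np.vstack([prefix, s]))
    q = s[-1] @ Wq
    outs_naive.append(attend(q, K, V))

assert all(np.allclose(a, b) for a, b in zip(outs_shared, outs_naive))
print("shared-prefix outputs match naive recomputation")
```

In this toy setting the prefix projections are computed once instead of once per candidate, which is where the paper's reported compute and memory savings come from: the saving grows with prefix length and with the number of candidate suffixes evaluated per attack step.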

🔍 Key Points

  • Introduction of Prefix-Shared KV Cache (PSKV), an optimization technique that reduces computational overhead in suffix jailbreak attacks by sharing key-value caches for fixed prefixes across multiple candidate suffixes.
  • PSKV enables a significant reduction in both inference time (40% faster) and peak memory usage (50% less) compared to conventional methods, while preserving the original Attack Success Rate (ASR).
  • The method includes suffix-centric alignment strategies that enable efficient batched processing of variable-length prompts, addressing both the compute and memory limitations of existing attack frameworks.
  • Extensive experimental validation demonstrated PSKV's effectiveness across multiple large language models (LLMs) and various suffix attack methodologies, highlighting its practical applicability for red-teaming scenarios.
  • PSKV functions as a plug-and-play solution that does not alter the fundamental logic of existing attack strategies, making it easy to integrate into current systems.
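The "suffix-centric alignment" point above suggests arranging variable-length candidates so the shared prefix cache lines up with every batch row. The paper's exact scheme is not described in this summary, so the following is a hypothetical sketch of one common way to do this: left-padding token ID lists so that all suffixes end at the same position, with an attention mask marking the real tokens.

```python
# Hypothetical suffix-centric alignment sketch: left-pad variable-length
# candidate suffixes so their tokens align at the right edge of the batch,
# letting a single shared prefix KV cache prepend cleanly to every row.
PAD_ID = 0  # assumed padding token id

suffixes = [[5, 9], [7, 1, 3, 2], [4]]  # made-up candidate suffix token ids
width = max(len(s) for s in suffixes)

batch = [[PAD_ID] * (width - len(s)) + s for s in suffixes]
mask  = [[0] * (width - len(s)) + [1] * len(s) for s in suffixes]

print(batch)  # [[0, 0, 5, 9], [7, 1, 3, 2], [0, 0, 0, 4]]
print(mask)   # [[0, 0, 1, 1], [1, 1, 1, 1], [0, 0, 0, 1]]
```

With this layout, padding sits between the shared prefix and each suffix and is masked out of attention, so one large batch of candidates can reuse the same prefix cache without per-row reshaping.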

💡 Why This Paper Matters

This paper presents a significant advance in the efficiency of suffix jailbreak attacks on large language models, attacks that are a core tool for evaluating and improving AI security. By leveraging the PSKV technique, researchers can substantially reduce the resource cost of such attacks, enabling broader and faster evaluation of AI systems' vulnerabilities.

🎯 Why It's Interesting for AI Security Researchers

The findings and methodologies detailed in this paper are highly pertinent to AI security researchers, particularly in the context of assessing the robustness of large language models against adversarial prompts. As AI systems become more integrated into various applications, ensuring their safety and reliability becomes paramount. PSKV not only enhances the efficiency of vulnerability assessments but also paves the way for more extensive exploration of model weaknesses, making it a critical contribution to ongoing efforts in AI security.