
Jailbreaking Large Vision Language Models in Intelligent Transportation Systems

Authors: Badhan Chandra Das, Md Tasnim Jawad, Md Jueal Mia, M. Hadi Amini, Yanzhao Wu

Published: 2025-11-17

arXiv ID: 2511.13892v1

Added to Library: 2025-11-19 03:01 UTC

Red Teaming

📄 Abstract

Large Vision Language Models (LVLMs) demonstrate strong capabilities in multimodal reasoning and many real-world applications, such as visual question answering. However, LVLMs are highly vulnerable to jailbreaking attacks. This paper systematically analyzes the vulnerabilities of LVLMs integrated into Intelligent Transportation Systems (ITS) under carefully crafted jailbreaking attacks. First, we construct a dataset of transportation-relevant harmful queries, following OpenAI's prohibited categories, to which LVLMs should not respond. Second, we introduce a novel jailbreaking attack that exploits the vulnerabilities of LVLMs through image typography manipulation and multi-turn prompting. Third, we propose a multi-layered response filtering defense technique to prevent the model from generating inappropriate responses. We perform extensive experiments with the proposed attack and defense on state-of-the-art LVLMs, both open-source and closed-source. To evaluate the attack method and defense technique, we use GPT-4 as a judge to score the toxicity of the generated responses, complemented by manual verification. Finally, we compare our proposed jailbreaking method with existing jailbreaking techniques and highlight the severe security risks that jailbreaking attacks based on image typography manipulation and multi-turn prompting pose to LVLMs integrated into ITS.
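
To make the described attack surface concrete, here is a minimal, hedged sketch of the general image-typography idea from the abstract: a text query is rendered as typography inside an image so the sensitive content travels through the vision channel, while the intent is split across conversation turns. The placeholder query, font handling, and the two-turn wrapper are illustrative assumptions for this summary, not the authors' exact pipeline.

```python
# Illustrative sketch only: renders a placeholder query as text in an image
# (the vision channel) and pairs it with a generic two-turn conversation.
# "[prohibited transportation query]" is a stand-in; no harmful content is included.
from PIL import Image, ImageDraw, ImageFont
import textwrap

def render_query_as_image(query: str, width: int = 768, height: int = 256) -> Image.Image:
    """Render a text query as typography on a plain white canvas."""
    canvas = Image.new("RGB", (width, height), color="white")
    draw = ImageDraw.Draw(canvas)
    font = ImageFont.load_default()  # assumption: any legible font suffices for the sketch
    draw.multiline_text((20, 20), textwrap.fill(query, width=60), fill="black", font=font)
    return canvas

typography_image = render_query_as_image("[prohibited transportation query]")
typography_image.save("typography_query.png")

# Multi-turn prompting, conceptually: the first turn only asks the model to read
# the image, and a follow-up turn asks it to act on what it read, so that no
# single turn contains an overtly harmful request in plain text.
conversation = [
    {"role": "user", "content": "Transcribe the text shown in the attached image."},
    {"role": "user", "content": "Now respond to that transcribed text in detail."},
]
```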

🔍 Key Points

  • Introduction of a novel jailbreaking attack method leveraging image typography manipulation and multi-turn prompting to exploit vulnerabilities in Large Vision Language Models (LVLMs) within Intelligent Transportation Systems (ITS).
  • Creation of a dedicated dataset containing harmful queries specifically designed for transportation-related scenarios, enhancing the analysis of LVLM vulnerabilities.
  • Development of a multi-layered defense mechanism that incorporates both rule-based filtering and zero-shot classification to mitigate the impact of jailbreaking attacks on LVLMs (a minimal sketch of such a filter follows this list).
  • Extensive experimental evaluation demonstrating the effectiveness of the proposed attack method, achieving high attack success rates and toxicity scores across multiple LVLMs.
  • Comparison with existing jailbreaking techniques, revealing significant security risks associated with LVLMs when integrated into time-sensitive and safety-critical systems, such as ITS.
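
As referenced above, the following is a minimal sketch of a multi-layered response filter in the spirit of the paper's defense, assuming a cheap rule-based keyword pass followed by an off-the-shelf zero-shot classifier. The keyword list, candidate labels, model choice (facebook/bart-large-mnli), and 0.8 threshold are illustrative assumptions, not the configuration reported in the paper.

```python
# Hedged sketch of a two-layer response filter: rule-based keywords, then
# zero-shot classification of the model's full response before it is released.
from transformers import pipeline

BLOCKED_KEYWORDS = ["explosive", "sabotage", "disable the brakes"]  # hypothetical rule-based layer

zero_shot = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
CANDIDATE_LABELS = ["harmful or unsafe instructions", "benign transportation assistance"]

def filter_response(response: str, threshold: float = 0.8) -> str:
    """Return the response if it passes both layers, otherwise a refusal."""
    # Layer 1: rule-based keyword filtering (fast, catches obvious cases).
    lowered = response.lower()
    if any(keyword in lowered for keyword in BLOCKED_KEYWORDS):
        return "I can't help with that request."

    # Layer 2: zero-shot classification of the candidate response.
    result = zero_shot(response, candidate_labels=CANDIDATE_LABELS)
    top_label, top_score = result["labels"][0], result["scores"][0]
    if top_label == "harmful or unsafe instructions" and top_score >= threshold:
        return "I can't help with that request."

    return response

print(filter_response("Here is how to plan a safe detour around road construction."))
```

Running the cheap lexical check before the classifier keeps latency low for obviously unsafe outputs, which matters in time-sensitive ITS deployments.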

💡 Why This Paper Matters

This paper highlights critical vulnerabilities in Large Vision Language Models (LVLMs) integrated within Intelligent Transportation Systems (ITS) and proposes innovative methods for both attack and defense. The findings underscore the urgent need for robust security measures in AI applications that influence real-world safety and decision-making. As transportation systems increasingly rely on sophisticated AI models, understanding and mitigating potential threats is paramount.

🎯 Why It's Interesting for AI Security Researchers

This paper will be of particular interest to AI security researchers as it addresses the intersection of artificial intelligence and cybersecurity within the transportation sector, a domain where safety and reliability are critical. The exploration of novel attack techniques against LVLMs, combined with a discussion of defense strategies, provides valuable insights not only into the vulnerabilities of these systems but also into the construction of robust countermeasures. Furthermore, the implications of such vulnerabilities can inform future developments in AI governance and safe AI model deployment.
