
Learning from Risk: LLM-Guided Generation of Safety-Critical Scenarios with Prior Knowledge

Authors: Yuhang Wang, Heye Huang, Zhenhua Xu, Kailai Sun, Baoshen Guo, Jinhua Zhao

Published: 2025-11-25

arXiv ID: 2511.20726v1

Added to Library: 2025-11-27 03:01 UTC

Safety

📄 Abstract

Autonomous driving faces critical challenges in rare long-tail events and complex multi-agent interactions, which are scarce in real-world data yet essential for robust safety validation. This paper presents a high-fidelity scenario generation framework that integrates a conditional variational autoencoder (CVAE) with a large language model (LLM). The CVAE encodes historical trajectories and map information from large-scale naturalistic datasets to learn latent traffic structures, enabling the generation of physically consistent base scenarios. Building on this, the LLM acts as an adversarial reasoning engine, parsing unstructured scene descriptions into domain-specific loss functions and dynamically guiding scenario generation across varying risk levels. This knowledge-driven optimization balances realism with controllability, ensuring that generated scenarios remain both plausible and risk-sensitive. Extensive experiments in CARLA and SMARTS demonstrate that our framework substantially increases the coverage of high-risk and long-tail events, improves consistency between simulated and real-world traffic distributions, and exposes autonomous driving systems to interactions that are significantly more challenging than those produced by existing rule- or data-driven methods. These results establish a new pathway for safety validation, enabling principled stress-testing of autonomous systems under rare but consequential events.
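
To make the two-stage design concrete, the sketch below shows a minimal conditional VAE of the kind the abstract describes: agent trajectories are autoencoded conditioned on map features, with a reparameterized latent and a standard ELBO objective. All class names, tensor shapes, and hyperparameters here are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal CVAE sketch: autoencode agent trajectories conditioned on map
# features. Names, shapes, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScenarioCVAE(nn.Module):
    def __init__(self, traj_dim=4, map_dim=64, latent_dim=32, horizon=30):
        super().__init__()
        self.horizon, self.traj_dim = horizon, traj_dim
        # Encoder: flattened trajectory + map context -> latent Gaussian.
        self.encoder = nn.Sequential(
            nn.Linear(horizon * traj_dim + map_dim, 256), nn.ReLU())
        self.to_mu = nn.Linear(256, latent_dim)
        self.to_logvar = nn.Linear(256, latent_dim)
        # Decoder: latent sample + map context -> reconstructed trajectory.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + map_dim, 256), nn.ReLU(),
            nn.Linear(256, horizon * traj_dim))

    def forward(self, traj, map_feat):
        h = self.encoder(torch.cat([traj.flatten(1), map_feat], dim=-1))
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = self.decoder(torch.cat([z, map_feat], dim=-1))
        return recon.view(-1, self.horizon, self.traj_dim), mu, logvar

def elbo_loss(recon, traj, mu, logvar, beta=1.0):
    # Reconstruction error plus KL divergence to the unit Gaussian prior.
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return F.mse_loss(recon, traj) + beta * kl
```

At inference time, sampling the latent from the prior and decoding against a given map would yield physically plausible base scenarios, which the LLM stage then perturbs toward the requested risk level.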

🔍 Key Points

  • Introduces a novel framework that combines a conditional variational autoencoder (CVAE) with a large language model (LLM) to synthesize safety-critical traffic scenarios.
  • The proposed framework effectively addresses the scarcity of rare long-tail events in real-world datasets by employing a knowledge-driven optimization process that balances realism with controllability.
  • Extensive experiments validate that the framework significantly increases the coverage of high-risk and long-tail events, demonstrating its superiority over existing rule-based or data-driven methods in exposing autonomous systems to complex interactions.
  • The framework dynamically adjusts its loss functions through LLM guidance, providing a formal mechanism for synthesizing diverse, risk-sensitive scenarios for stress-testing autonomous driving systems (see the sketch after this list).
  • Results indicate that the generated scenarios maintain high realism and fidelity to real-world traffic distributions, supporting more reliable validation of autonomous driving technologies.
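
The LLM-guided loss adjustment mentioned above can be pictured as follows: the LLM turns a free-text scene description into a structured risk specification, which is then compiled into a differentiable objective over agent states. The JSON schema, metric definitions, and weighting scheme below are assumptions for illustration only; the paper's actual loss parameterization may differ.

```python
# Sketch of compiling an LLM-parsed risk specification into a scenario loss.
# Schema, term names, and metrics are illustrative assumptions.
import json
import torch

# Example of what the LLM might return for "force a hard cut-in at low TTC":
llm_output = json.dumps({
    "terms": [
        {"name": "time_to_collision", "target": 1.5, "weight": 2.0},
        {"name": "lateral_gap", "target": 0.5, "weight": 1.0},
    ]
})

def risk_metrics(ego, adversary):
    """Toy differentiable risk metrics from (x, y, vx, vy) state vectors."""
    rel_pos = adversary[:2] - ego[:2]
    rel_vel = adversary[2:] - ego[2:]
    closing = torch.clamp(-(rel_pos * rel_vel).sum(), min=1e-3)
    ttc = rel_pos.norm() ** 2 / closing          # crude TTC proxy, in seconds
    return {"time_to_collision": ttc, "lateral_gap": rel_pos[1].abs()}

def scenario_loss(spec_json, ego, adversary):
    """Weighted squared deviation of each metric from its specified target."""
    spec = json.loads(spec_json)
    metrics = risk_metrics(ego, adversary)
    loss = torch.zeros(())
    for term in spec["terms"]:
        loss = loss + term["weight"] * (metrics[term["name"]] - term["target"]) ** 2
    return loss

# The gradient of this loss w.r.t. the adversary state (or the CVAE latent)
# can then steer generated scenarios toward the requested risk level.
ego = torch.tensor([0.0, 0.0, 10.0, 0.0])
adv = torch.tensor([20.0, 2.0, 5.0, -0.5], requires_grad=True)
scenario_loss(llm_output, ego, adv).backward()
```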

💡 Why This Paper Matters

This paper presents a significant advance in autonomous driving safety validation: a hybrid generative framework that couples a data-driven CVAE with LLM-guided, knowledge-driven optimization to simulate critical driving scenarios. By bridging the gap between simulated environments and real-world traffic dynamics, the approach offers a robust platform for safely stress-testing and validating autonomous driving systems against rare but potentially catastrophic events.

🎯 Why It's Interesting for AI Security Researchers

For AI security researchers, this paper is of particular interest because it addresses a key aspect of AI safety: the generation of adversarial scenarios that expose vulnerabilities in autonomous systems. The integration of generative models with knowledge-driven optimization is directly relevant to research on the robustness and reliability of AI in high-stakes applications such as autonomous driving. Insights from the discussion of risk-sensitive scenario generation can also inform broader work on adversarial machine learning and secure AI deployment.

📚 Read the Full Paper

https://arxiv.org/abs/2511.20726v1