SafeGen-LLM: Enhancing Safety Generalization in Task Planning for Robotic Systems

Authors: Jialiang Fan, Weizhe Xu, Mengyu Liu, Oleg Sokolsky, Insup Lee, Fangxin Kong

Published: 2026-02-27

arXiv ID: 2602.24235v1

Added to Library: 2026-03-02 03:01 UTC

Safety

📄 Abstract

Safety-critical task planning in robotic systems remains challenging: classical planners scale poorly, Reinforcement Learning (RL)-based methods generalize poorly, and base Large Language Models (LLMs) cannot guarantee safety. To address this gap, we propose SafeGen-LLM, a safety-generalizable large language model. SafeGen-LLM not only improves the safety compliance of task plans but also generalizes well to novel safety properties across domains. We first construct a multi-domain Planning Domain Definition Language 3 (PDDL3) benchmark with explicit safety constraints. We then introduce a two-stage post-training framework: Supervised Fine-Tuning (SFT) on a constraint-compliant planning dataset to learn planning syntax and semantics, followed by Group Relative Policy Optimization (GRPO) guided by fine-grained reward machines derived from formal verification to enforce safety alignment, with curriculum learning to better handle complex tasks. Extensive experiments show that SafeGen-LLM achieves strong safety generalization and outperforms frontier proprietary baselines across multi-domain planning tasks and multiple input formats (e.g., PDDL and natural language).
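The "fine-grained reward machines derived from formal verification" mentioned in the abstract can be illustrated with a small sketch. PDDL3 expresses trajectory constraints such as `always` and `sometime-before`; a reward machine compiled from such a constraint can walk a candidate plan step by step and emit partial credit rather than a binary pass/fail. The following Python sketch is illustrative only, under assumptions of ours: the constraint ("an object must be scanned before it is grasped") and all names are hypothetical examples, not taken from the paper.

```python
# Minimal sketch of a reward machine for a PDDL3-style safety constraint.
# Hypothetical constraint: "always scan an object before grasping it"
# (a sometime-before-style trajectory constraint). The machine walks a
# plan action-by-action and returns a fine-grained reward: the fraction
# of steps completed before the first violation.

def reward_machine(plan):
    scanned = set()   # objects scanned so far (the machine's state)
    safe_steps = 0
    for action, obj in plan:
        if action == "scan":
            scanned.add(obj)
        elif action == "grasp" and obj not in scanned:
            # Violation: give partial credit for the safe prefix.
            return safe_steps / len(plan)
        safe_steps += 1
    return 1.0  # the whole plan satisfies the constraint

safe_plan = [("scan", "cup"), ("grasp", "cup"), ("place", "cup")]
bad_plan  = [("grasp", "cup"), ("place", "cup")]

print(reward_machine(safe_plan))  # 1.0
print(reward_machine(bad_plan))   # 0.0 (violation at the first step)
```

A graded signal like this gives the RL stage a denser learning gradient than rejecting unsafe plans outright.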

🔍 Key Points

  • Introduction of SafeGen-LLM, a novel framework for enhancing safety in task planning for robotic systems using Large Language Models (LLMs).
  • Development of a multi-domain Planning Domain Definition Language 3 (PDDL3) benchmark that incorporates explicit safety constraints, enabling systematic evaluation of safety-aware planning models.
  • Implementation of a two-stage post-training framework consisting of Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO), which significantly improves safety compliance and planning success rates over baseline methods.
  • Demonstration of cross-domain safety generalization, wherein SafeGen-LLM can effectively adapt to unseen planning problems across different domains while maintaining safety requirements.
  • Validation of the proposed framework in real-world robotic settings, showing the practical implications of safe and efficient task planning.
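The GRPO stage in the key points above scores a group of candidate plans per prompt and normalizes rewards within that group, so no separate value network is needed. A minimal sketch of the group-relative advantage computation (our illustration of the general GRPO idea, not the paper's implementation; reward values are made up):

```python
# Sketch of GRPO's group-relative advantage: sample a group of candidate
# plans for one prompt, score each (e.g., with a safety reward machine),
# then normalize within the group (subtract mean, divide by std). Plans
# safer than their group peers receive positive advantages.

def group_relative_advantages(rewards, eps=1e-8):
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    # eps guards against division by zero when all rewards are equal
    return [(r - mean) / (std + eps) for r in rewards]

# Example: four sampled plans with hypothetical safety rewards in [0, 1].
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
print([round(a, 2) for a in advs])  # [1.41, -1.41, 0.0, 0.0]
```

Each advantage then weights the policy-gradient update for the tokens of its plan, pushing the model toward plans the safety reward prefers.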

💡 Why This Paper Matters

This paper presents a significant advance in robotic task planning: it combines LLM post-training with formal safety constraints so that generated plans remain safe in unpredictable environments. The approach both strengthens the robustness of LLM-based task planning and lays a foundation for future safety-critical applications in robotics.

🎯 Why It's Interesting for AI Security Researchers

The focus on safety generalization in AI systems, particularly regarding the implications for autonomous robotic operations, makes this research highly pertinent for AI security researchers. As AI technologies become increasingly integrated into society, ensuring their safety and compliance with defined constraints is vital to prevent catastrophic failures in real-world applications, directly impacting public safety and trust in AI systems.
