SAFENLIDB: A Privacy-Preserving Safety Alignment Framework for LLM-based Natural Language Database Interfaces

Authors: Ruiheng Liu, XiaoBing Chen, Jinyu Zhang, Qiongwen Zhang, Yu Zhang, Bailong Yang

Published: 2025-11-10

arXiv ID: 2511.06778v2

Added to Library: 2025-11-12 04:00 UTC

Safety

📄 Abstract

The rapid advancement of Large Language Models (LLMs) has driven significant progress in Natural Language Interface to Database (NLIDB). However, the widespread adoption of LLMs has raised critical privacy and security concerns. During interactions, LLMs may unintentionally expose confidential database contents or be manipulated by attackers to exfiltrate data through seemingly benign queries. While current efforts typically rely on rule-based heuristics or LLM agents to mitigate this leakage risk, these methods still struggle with complex inference-based attacks, suffer from high false positive rates, and often compromise the reliability of SQL queries. To address these challenges, we propose SafeNlidb, a novel privacy-security alignment framework for LLM-based NLIDB. The framework features an automated pipeline that generates hybrid chain-of-thought interaction data from scratch, seamlessly combining implicit security reasoning with SQL generation. Additionally, we introduce reasoning warm-up and alternating preference optimization to overcome the multi-preference oscillations of Direct Preference Optimization (DPO), enabling LLMs to produce security-aware SQL through fine-grained reasoning without the need for human-annotated preference data. Extensive experiments demonstrate that our method outperforms both larger-scale LLMs and ideal-setting baselines, achieving significant security improvements while preserving high utility. WARNING: This work may contain content that is offensive and harmful!
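The abstract's core optimization idea can be sketched in a few lines. The standard DPO loss below is the well-known formulation; the alternating schedule is a hypothetical illustration of how two competing preference objectives (security vs. SQL utility) might be interleaved across steps rather than mixed in every batch, which the authors identify as a source of multi-preference oscillation. The function names and the even/odd schedule are assumptions for exposition, not the paper's exact algorithm.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    Inputs are log-probabilities of the chosen/rejected responses under the
    policy being trained and under the frozen reference model.
    Loss = -log sigmoid(beta * (policy_margin - reference_margin)).
    """
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

def alternating_dpo_step(step, security_pair, utility_pair, beta=0.1):
    """Hypothetical alternating schedule: even steps optimize a
    security-preference pair, odd steps a SQL-utility pair, so the two
    objectives never pull on the same gradient update simultaneously."""
    pair = security_pair if step % 2 == 0 else utility_pair
    return dpo_loss(*pair, beta=beta)
```

With a zero margin (policy identical to the reference), the loss is -log(0.5) = ln 2; it decreases as the policy raises the chosen response's likelihood relative to the rejected one.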

🔍 Key Points

  • Introduction of the SafeNlidb framework, which integrates privacy-preserving mechanisms specifically designed to enhance security in NLIDB interfaces utilizing LLMs.
  • Development of a comprehensive automated data synthesis process that creates hybrid chain-of-thought interaction samples, addressing the scarcity of secure NLIDB interaction data.
  • Innovation in optimization techniques through Alternating Preference Optimization (APO) to facilitate security-aware SQL generation without human-annotated preferences, providing fine-grained alignment between SQL performance and security.
  • Extensive experimental validation demonstrating SafeNlidb's superior capabilities in mitigating privacy risks while maintaining high utility across diverse NLIDB scenarios, outperforming both larger-scale LLMs and ideal-setting baselines.
  • Creation of the ShieldSQL benchmark, offering a robust framework for evaluating security across various attack scenarios, enhancing future research in this area.

💡 Why This Paper Matters

The paper presents the SafeNlidb framework, a significant step toward safe natural-language interaction with databases without sacrificing privacy. Its automated data synthesis, coupled with alternating preference optimization, offers a practical way to protect sensitive data from adversarial queries, making it a valuable contribution to both NLP and database security.

🎯 Why It's Interesting for AI Security Researchers

AI security researchers would find this paper relevant as it tackles critical challenges in secure database interactions using LLMs. The rigorous methodologies developed to prevent data leakage and enhance security through robust frameworks can directly inform the design of future AI systems. Additionally, the introduction of benchmark datasets such as ShieldSQL provides crucial tools for furthering research in AI safety and security.

📚 Read the Full Paper