
Towards Trustworthy Lexical Simplification: Exploring Safety and Efficiency with Small LLMs

Authors: Akio Hayakawa, Stefan Bott, Horacio Saggion

Published: 2025-09-29

arXiv ID: 2509.25086v1

Added to Library: 2025-09-30 04:05 UTC

Safety

📄 Abstract

Despite their strong performance, large language models (LLMs) face challenges in real-world application of lexical simplification (LS), particularly in privacy-sensitive and resource-constrained environments. Moreover, since vulnerable user groups (e.g., people with disabilities) are one of the key target groups of this technology, it is crucial to ensure the safety and correctness of the output of LS systems. To address these issues, we propose an efficient framework for LS systems that utilizes small LLMs deployable in local environments. Within this framework, we explore knowledge distillation with synthesized data and in-context learning as baselines. Our experiments in five languages evaluate model outputs both automatically and manually. Our manual analysis reveals that while knowledge distillation boosts automatic metric scores, it also introduces a safety trade-off by increasing harmful simplifications. Importantly, we find that the model's output probability is a useful signal for detecting harmful simplifications. Leveraging this, we propose a filtering strategy that suppresses harmful simplifications while largely preserving beneficial ones. This work establishes a benchmark for efficient and safe LS with small LLMs. It highlights the key trade-offs between performance, efficiency, and safety, and demonstrates a promising approach for safe real-world deployment.
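As a rough illustration of the knowledge-distillation baseline mentioned in the abstract, the sketch below shows how a large teacher model could be used to synthesize simplification pairs that a small, locally deployable student model is then fine-tuned on. The function `call_teacher_llm`, the prompt wording, and the record layout are hypothetical placeholders chosen for illustration, not the authors' actual pipeline.

```python
# Minimal sketch of the "knowledge distillation with synthesized data" baseline:
# a large teacher LLM proposes simpler substitutes, and the resulting pairs become
# supervised fine-tuning data for a small student model.

def call_teacher_llm(prompt: str) -> str:
    """Placeholder for a call to the teacher model (e.g. a hosted large LLM)."""
    return "use"  # stubbed response so the sketch runs end to end


def synthesize_distillation_data(examples):
    """Turn (sentence, complex_word) pairs into training records for the student."""
    records = []
    for sentence, complex_word in examples:
        prompt = (
            f"Sentence: {sentence}\n"
            f"Give one simpler word that can replace '{complex_word}' "
            "without changing the meaning."
        )
        simple_word = call_teacher_llm(prompt).strip()
        records.append(
            {
                "input": f"Simplify '{complex_word}' in: {sentence}",
                "target": simple_word,
            }
        )
    return records


if __name__ == "__main__":
    data = synthesize_distillation_data(
        [("We must utilize every resource available.", "utilize")]
    )
    print(data)  # these records would then be used to fine-tune the small student LLM
```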

🔍 Key Points

  • Proposes an efficient framework for lexical simplification (LS) that utilizes small LLMs, addressing challenges in privacy-sensitive and resource-constrained environments.
  • Investigates the use of knowledge distillation with synthesized data and in-context learning for LS, while emphasizing the safety and correctness of outputs, which is critical for vulnerable user groups.
  • Identifies a significant safety trade-off: while knowledge distillation improves automatic performance metrics, it also leads to higher rates of harmful simplifications.
  • Introduces a novel filtering strategy that uses the model's output probabilities as a signal for detecting and suppressing harmful simplifications while preserving beneficial outputs (see the sketch after this list).
  • Establishes a benchmark for evaluating safety and efficiency in LS systems using small LLMs across multiple languages (English, Spanish, Catalan, German, Japanese).
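The probability-based filtering idea from the points above can be sketched as follows. The length-normalized log-probability score and the threshold value are illustrative assumptions; the paper reports only that the model's output probability is a useful signal for flagging harmful simplifications, not these exact settings.

```python
import math

# Hedged sketch of probability-based filtering: keep a generated substitute only if the
# model produced it with sufficient confidence; otherwise fall back to the original word,
# treating low-confidence outputs as potentially harmful simplifications.


def sequence_log_prob(token_log_probs):
    """Length-normalized log-probability of a generated substitute."""
    return sum(token_log_probs) / max(len(token_log_probs), 1)


def filter_simplification(original_word, candidate, token_log_probs, threshold=-1.5):
    """Return the candidate if its confidence clears the threshold, else the original word."""
    score = sequence_log_prob(token_log_probs)
    return candidate if score >= threshold else original_word


# Toy usage: a confident substitute is kept, a low-confidence one is suppressed.
print(filter_simplification("utilize", "use", [math.log(0.9), math.log(0.8)]))     # -> "use"
print(filter_simplification("mitigate", "stop", [math.log(0.05), math.log(0.1)]))  # -> "mitigate"
```

The design choice here mirrors the trade-off described in the abstract: a stricter threshold suppresses more harmful outputs at the cost of discarding some beneficial simplifications.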

💡 Why This Paper Matters

This paper is pivotal for advancing the field of lexical simplification by providing a framework that balances efficiency and safety through the use of small LLMs. It highlights the importance of ensuring that the output not only simplifies language but does so safely for vulnerable populations. The findings lay the groundwork for future research on real-world deployment of simplification systems, making this an essential contribution to both AI and accessibility technology.

🎯 Why It's Interesting for AI Security Researchers

AI security researchers will find this paper valuable because it tackles the dual challenge of preserving data privacy while deploying language models in real-world applications. Its emphasis on detecting harmful simplifications, and on output probability as a practical safety signal, sits at a critical intersection between model performance and ethical AI deployment. Moreover, the insights into the safety trade-offs of knowledge distillation deepen understanding of the vulnerabilities introduced when compressing language models, informing security protocols and best practices for responsible AI development.
