
The State of Multilingual LLM Safety Research: From Measuring the Language Gap to Mitigating It

Authors: Zheng-Xin Yong, Beyza Ermis, Marzieh Fadaee, Stephen H. Bach, Julia Kreutzer

Published: 2025-05-30

arXiv ID: 2505.24119v1

Added to Library: 2025-06-02 03:02 UTC

Safety

📄 Abstract

This paper presents a comprehensive analysis of the linguistic diversity of LLM safety research, highlighting the English-centric nature of the field. Through a systematic review of nearly 300 publications from 2020–2024 across major NLP conferences and workshops at *ACL, we identify a significant and growing language gap in LLM safety research, with even high-resource non-English languages receiving minimal attention. We further observe that non-English languages are rarely studied as standalone languages and that English safety research exhibits poor language documentation practices. To motivate future research into multilingual safety, we make several recommendations based on our survey, and we then pose three concrete future directions on safety evaluation, training data generation, and crosslingual safety generalization. Building on our survey and proposed directions, the field can develop more robust, inclusive AI safety practices for diverse global populations.

🔍 Key Points

  • The paper systematically reviews LLM safety research from 2020–2024 and finds that non-English languages are significantly underrepresented, underscoring an English-centric bias in safety assessments and methodologies.
  • The authors identify three main directions for future research: improving multilingual safety evaluation, generating culturally contextualized training data, and understanding crosslingual safety generalization.
  • The research highlights the need for better language documentation practices: many studies fail to specify which languages they address, obscuring the true coverage of the safety landscape (see the sketch after this list).
  • The survey reveals disparities across safety subtopics such as LLM alignment and toxicity, where non-English languages receive limited coverage; this gap could amplify societal harms in multilingual communities.
  • The paper provides actionable recommendations for researchers and conference organizers to promote multilingual LLM safety, aiming for a more equitable distribution of AI benefits and risks.
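The survey's core measurement, counting which languages each paper covers and how many papers document none, can be illustrated with a small tally over paper metadata. Below is a minimal sketch, assuming a hypothetical list of records with a `languages` field; the field name and the sample records are illustrative, not the authors' actual annotation pipeline (which relied on manual review of the publications).

```python
from collections import Counter

# Hypothetical metadata for surveyed safety papers; in the actual survey,
# language coverage was annotated manually from nearly 300 *ACL publications.
papers = [
    {"title": "Toxicity benchmark A", "languages": ["en"]},
    {"title": "Jailbreak study B", "languages": ["en", "zh"]},
    {"title": "Alignment eval C", "languages": []},  # languages unspecified
]

# Tally how many papers cover each language, and how many document none at all.
coverage = Counter(lang for p in papers for lang in p["languages"])
undocumented = sum(1 for p in papers if not p["languages"])

total = len(papers)
for lang, n in coverage.most_common():
    print(f"{lang}: {n}/{total} papers ({n / total:.0%})")
print(f"no languages documented: {undocumented}/{total}")
```

Even this toy tally surfaces the two findings the survey emphasizes: English dominates the per-language counts, and a nontrivial share of papers never state which languages they study.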

💡 Why This Paper Matters

This paper is highly relevant because it exposes critical gaps in multilingual LLM safety research, an area that grows more vital as AI tools are deployed across diverse linguistic populations worldwide. By addressing linguistic bias and advocating a more inclusive approach to safety assessment, the work lays out concrete recommendations and research directions for ensuring that LLMs can operate safely in any language, fostering awareness and guiding future research toward more equitable AI practices.

🎯 Why It's Interesting for AI Security Researchers

AI security researchers will find this paper interesting because it uncovers significant gaps in the safety coverage of large language models when they are used in non-English contexts. The findings emphasize the importance of safety mechanisms that account for linguistic and cultural diversity, and they open new lines of investigation into how existing models might fail in multilingual environments. The proposed directions for evaluating and improving multilingual LLM safety also bear directly on the goal of making AI systems safe and reliable across all user demographics.
