Towards Reliable Machine Translation: Scaling LLMs for Critical Error Detection and Safety

Authors: Muskaan Chopra, Lorenz Sparrenberg, Rafet Sifa

Published: 2026-02-11

arXiv ID: 2602.11444v1

Added to Library: 2026-02-13 03:02 UTC

Safety

📄 Abstract

Machine Translation (MT) plays a pivotal role in cross-lingual information access, public policy communication, and equitable knowledge dissemination. However, critical meaning errors, such as factual distortions, intent reversals, or biased translations, can undermine the reliability, fairness, and safety of multilingual systems. In this work, we explore the capacity of instruction-tuned Large Language Models (LLMs) to detect such critical errors, evaluating models across a range of parameter scales using publicly accessible datasets. Our findings show that model scaling and adaptation strategies (zero-shot, few-shot, fine-tuning) yield consistent improvements, outperforming encoder-only baselines like XLM-R and ModernBERT. We argue that improving critical error detection in MT contributes to safer, more trustworthy, and socially accountable information systems by reducing the risk of disinformation, miscommunication, and linguistic harm, especially in high-stakes or underrepresented contexts. This work positions error detection not merely as a technical challenge, but as a necessary safeguard in the pursuit of just and responsible multilingual AI. The code will be made available at GitHub.

🔍 Key Points

  • The paper presents a cross-model scaling study of instruction-tuned large language models (LLMs) for critical error detection in machine translation, showing their superiority over traditional encoder-only models like XLM-R and ModernBERT.
  • It demonstrates that model scaling and various adaptation strategies (zero-shot, few-shot, and fine-tuning) can significantly improve the performance of LLMs in identifying critical meaning errors in translations, thus addressing issues of reliability and fairness in multilingual systems.
  • The research emphasizes the necessity of integrating critical error detection (CED) within translation workflows to ensure that multilingual AI systems are socially accountable and trustworthy, ultimately aiming to prevent disinformation and miscommunication in high-stakes contexts.
  • The study shows that lightweight committees of models both improve performance stability and offer a cost-effective route to critical error detection in multilingual settings, suggesting a practical approach for deploying these models in real-world scenarios.
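The committee idea above can be sketched as a simple majority vote over binary critical-error judgments. This is an illustrative reconstruction, not the paper's released code: the `Classifier` type, the stub judges, and the "ERR"/"NOT" labels are assumptions standing in for instruction-tuned LLM calls.

```python
# Minimal sketch of a majority-vote committee for binary critical error
# detection. Each committee member maps a (source, translation) pair to
# "ERR" (critical error) or "NOT" (no critical error); the labels and
# member stubs below are hypothetical, not taken from the paper.
from collections import Counter
from typing import Callable, List

Classifier = Callable[[str, str], str]  # (source, translation) -> "ERR" | "NOT"

def committee_vote(source: str, translation: str,
                   members: List[Classifier]) -> str:
    """Return the majority label; ties fall back to the safer label 'ERR'."""
    votes = Counter(m(source, translation) for m in members)
    return "ERR" if votes.get("ERR", 0) >= votes.get("NOT", 0) else "NOT"

# Stand-ins for instruction-tuned LLM judges (real members would prompt a model):
flags_error = lambda s, t: "ERR"
passes_ok = lambda s, t: "NOT"

label = committee_vote("src text", "mt output",
                       [flags_error, passes_ok, flags_error])
# → "ERR" (two of three members flag a critical error)
```

Breaking ties toward "ERR" reflects the safety framing of the paper: in high-stakes settings, a false alarm is cheaper than a missed critical error.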

💡 Why This Paper Matters

This paper highlights the importance of critical error detection in machine translation, advocating for instruction-tuned LLMs to ensure reliability and safety in multilingual communication. Its findings underscore that AI models must not only deliver accurate translations but also ensure that those translations preserve the intended meaning and do not propagate bias or misinformation. The practical implications of this research support the development of safer and more equitable AI systems, which is crucial as global reliance on automated translation tools increases.

🎯 Why It's Interesting for AI Security Researchers

For AI security researchers, this paper is of significant interest as it addresses the risks associated with machine translation systems—particularly the propagation of misinformation and biased content through inaccurate translations. By exploring novel methodologies for enhancing critical error detection, the research connects directly to broader themes in AI safety, such as trustworthiness and ethical implications of AI deployment in critical applications. Understanding and mitigating these risks is essential for developing secure and responsible AI technologies.

📚 Read the Full Paper