
Robust or Suggestible? Exploring Non-Clinical Induction in LLM Drug-Safety Decisions

Authors: Siying Liu, Shisheng Zhang, Indu Bala

Published: 2025-10-15

arXiv ID: 2510.13931v1

Added to Library: 2025-10-17 04:02 UTC

Category: Safety

📄 Abstract

Large language models (LLMs) are increasingly applied in biomedical domains, yet their reliability in drug-safety prediction remains underexplored. In this work, we investigate whether LLMs incorporate socio-demographic information into adverse event (AE) predictions, despite such attributes being clinically irrelevant. Using structured data from the United States Food and Drug Administration Adverse Event Reporting System (FAERS) and a persona-based evaluation framework, we assess two state-of-the-art models, ChatGPT-4o and Bio-Medical-Llama-3-8B, across diverse personas defined by education, marital status, employment, insurance, language, housing stability, and religion. We further evaluate performance across three user roles (general practitioner, specialist, patient) to reflect real-world deployment scenarios where commercial systems often differentiate access by user type. Our results reveal systematic disparities in AE prediction accuracy. Disadvantaged groups (e.g., low education, unstable housing) were frequently assigned higher predicted AE likelihoods than more privileged groups (e.g., postgraduate-educated, privately insured). Beyond outcome disparities, we identify two distinct modes of bias: explicit bias, where incorrect predictions directly reference persona attributes in reasoning traces, and implicit bias, where predictions are inconsistent, yet personas are not explicitly mentioned. These findings expose critical risks in applying LLMs to pharmacovigilance and highlight the urgent need for fairness-aware evaluation protocols and mitigation strategies before clinical deployment.
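To make the setup concrete, here is a minimal sketch of what such a persona-based evaluation loop might look like. This is not the authors' code: the persona attribute values, the role list, the prompt wording, and the `query_model` callable are all illustrative assumptions.

```python
# Minimal sketch of a persona-based AE-prediction evaluation loop.
# The persona fields, roles, and prompt template are illustrative
# assumptions, not the paper's exact protocol.
from itertools import product

PERSONAS = {
    "education": ["postgraduate", "high school", "no formal education"],
    "housing": ["stable housing", "unstable housing"],
    "insurance": ["privately insured", "uninsured"],
}
ROLES = ["general practitioner", "specialist", "patient"]

def build_prompt(case: dict, persona: dict, role: str) -> str:
    """Embed a FAERS-style case plus clinically irrelevant persona
    attributes into a single AE-prediction prompt."""
    persona_text = ", ".join(f"{k}: {v}" for k, v in persona.items())
    return (
        f"You are answering as a {role}.\n"
        f"Patient background (non-clinical): {persona_text}.\n"
        f"Drug: {case['drug']}. Reported reaction: {case['reaction']}.\n"
        "Question: Is this patient likely to experience this adverse "
        "event? Answer 'yes' or 'no' and explain your reasoning."
    )

def evaluate(case: dict, query_model) -> list[dict]:
    """Query the model once per (persona, role) combination and
    collect the answers for later disparity analysis."""
    results = []
    keys = list(PERSONAS)
    for values in product(*PERSONAS.values()):
        persona = dict(zip(keys, values))
        for role in ROLES:
            answer = query_model(build_prompt(case, persona, role))
            results.append({"persona": persona, "role": role, "answer": answer})
    return results
```

Collecting one answer per (persona, role) pair is what makes the reported disparities measurable: the clinical content of the prompt is held fixed while only the clinically irrelevant attributes vary.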

🔍 Key Points

  • The paper investigates socio-demographic biases in LLM predictions related to adverse events (AEs) in drug safety, revealing significant disparities based on education, housing stability, and insurance status.
  • A persona-based evaluation framework was developed to systematically assess bias in LLM outputs, indicating that socio-demographic attributes significantly influence predictions even when clinically irrelevant.
  • Two modes of bias were identified: explicit bias, where incorrect predictions directly reference socio-demographic traits in their reasoning traces, and implicit bias, where predictions vary noticeably across personas without the attributes being mentioned (see the sketch after this list).
  • Results demonstrated that disadvantaged groups (e.g., low education, unstable housing) were frequently assigned higher predicted AE likelihoods than more privileged groups, contradicting the assumption of LLM neutrality in clinical applications.
  • The authors argue for fairness-aware evaluation protocols and mitigation strategies before deploying these LLMs in clinical contexts.
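The explicit/implicit distinction suggests a simple post-hoc check. Below is a rough heuristic sketch, assuming each evaluation record carries the model's answer and its free-text reasoning trace; the field names and substring matching are illustrative assumptions, not the authors' actual analysis pipeline.

```python
# Heuristic classifier for the two bias modes described above.
# Assumes `records` is the output of the evaluation loop sketched
# earlier, with 'persona', 'answer', and 'reasoning' fields; the
# matching logic is an illustrative simplification.
def classify_bias(records: list[dict], ground_truth: str) -> str:
    """Return 'consistent', 'explicit', or 'implicit' for one case.

    ground_truth: normalized correct answer, e.g. 'yes' or 'no'.
    """
    answers = {r["answer"].strip().lower() for r in records}
    if len(answers) == 1:
        return "consistent"  # same prediction for every persona

    # Explicit bias: an incorrect prediction whose reasoning trace
    # cites a persona attribute value (e.g. "unstable housing").
    for r in records:
        if r["answer"].strip().lower() != ground_truth:
            trace = r["reasoning"].lower()
            if any(str(v).lower() in trace for v in r["persona"].values()):
                return "explicit"

    # Implicit bias: predictions diverge across personas, but no
    # incorrect answer names a persona attribute directly.
    return "implicit"
```

A real analysis would need more robust attribute detection than substring matching, but the distinction mirrors the one the paper draws: divergent predictions that explicitly cite persona attributes versus divergence alone.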

💡 Why This Paper Matters

This paper is a significant contribution to understanding the intersection of artificial intelligence and healthcare, specifically pharmacovigilance. By exposing the socio-demographic biases embedded in LLMs, it underscores the urgent need for fairness in systems that directly affect patient safety and equitable healthcare delivery, and it motivates the development of better evaluation and mitigation strategies for AI systems in clinical settings.

🎯 Why It's Interesting for AI Security Researchers

AI security researchers would be interested in this paper as it touches on the ethical implications of deploying LLMs in high-stakes healthcare environments. The identified biases pose risks not only to patient outcomes but also to system integrity and trustworthiness. Understanding these vulnerabilities is crucial for developing frameworks that ensure AI systems are robust against biases and other security threats, thereby enhancing fairness and accountability in AI-driven decision-making.

📚 Read the Full Paper: https://arxiv.org/abs/2510.13931