
Bioalignment: Measuring and Improving LLM Disposition Toward Biological Systems for AI Safety

Authors: Trent R Northen, Mingxun Wang

Published: 2026-03-10

arXiv ID: 2603.09154v1

Added to Library: 2026-03-11 03:01 UTC

Tags: Safety

📄 Abstract

Large language models (LLMs) trained on internet-scale corpora can exhibit systematic biases that increase the probability of unwanted behavior. In this study, we examined potential biases towards synthetic vs. biological technological solutions across four domains (materials, energy, manufacturing, and algorithms). Five frontier and five open-weight models were measured using 50 curated Bioalignment prompts with a Kelly criterion-inspired evaluation framework. According to this metric, most models were not bioaligned in that they exhibited biases in favor of synthetic (non-biological) solutions. We next examined whether fine-tuning could increase the preferences of two open-weight models, Llama 3.2-3B-Instruct and Qwen2.5-3B-Instruct, for biology-based approaches. A curated corpus of ~22M tokens from 6,636 PMC articles emphasizing biological problem-solving was used, first to fine-tune Llama 3B with a mix of continued-pretraining and instruction-formatted data, and then extended to Qwen 3B using instruction-formatted data only. We found that QLoRA fine-tuning significantly increased the scoring of biological solutions for both models without degrading general capabilities (Holm-Bonferroni-corrected p < 0.001 and p < 0.01, respectively). This suggests that even a small amount of fine-tuning can change how models weigh the relative value of biological and bioinspired vs. synthetic approaches. Although this work focused on small open-weight LLMs, the approach may be extensible to much larger models and could be used to develop models that favor bio-based approaches. We release the benchmark, corpus, code, and adapter weights.
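
The paper's exact scoring formula is not reproduced in this summary, but a Kelly-criterion-inspired bias metric can be sketched in a few lines. In this hypothetical version, each benchmark prompt yields the model's probability p of preferring the biological option over the synthetic one; that probability is mapped to the Kelly bet fraction for an even-odds wager, and the fractions are averaged across prompts. The helper names below are illustrative, not the authors' released code.

```python
# Hypothetical sketch of a Kelly-criterion-inspired bias score
# (illustrative names and formula; not the authors' released code).
from statistics import mean

def kelly_fraction(p: float, odds: float = 1.0) -> float:
    """Kelly bet fraction f* = p - (1 - p) / b for win probability p
    and net odds b; with even odds (b = 1) this reduces to 2p - 1."""
    return p - (1.0 - p) / odds

def bioalignment_score(bio_probs: list[float]) -> float:
    """Mean Kelly fraction across the benchmark prompts:
    > 0 favors biological options, < 0 favors synthetic ones."""
    return mean(kelly_fraction(p) for p in bio_probs)

# Example: a model that usually prefers the synthetic option
# scores negative, i.e. it is not bioaligned under this metric.
probs = [0.35, 0.40, 0.55, 0.30, 0.45]
print(f"bioalignment score: {bioalignment_score(probs):+.3f}")
```

Under this reading, a score of 0 would mark a model with no net preference, matching the paper's framing that most models scored on the synthetic side of neutral.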

🔍 Key Points

  • Introduction of the Bioalignment Benchmark, a framework with 50 prompts measuring LLM biases towards biological versus synthetic solutions across four engineering domains.
  • Demonstration of significant bias in existing LLMs, with most showing a preference for synthetic solutions over biological ones, highlighting a potential risk in AI recommendations.
  • Successful application of QLoRA fine-tuning to shift the preferences of Llama 3B and Qwen 3B towards biological solutions, significantly increasing their bias scores without degrading general capabilities (a configuration sketch follows this list).
  • Findings suggest that bioalignment can be achieved using relatively small datasets (as few as 0.5M tokens), emphasizing scalability for future research using larger models.
  • The results provide a pathway for enhancing AI safety by instilling biases towards biological approaches, potentially leading to more sustainable AI outcomes.
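
The fine-tuning method named in the paper is QLoRA. The sketch below shows what a standard QLoRA setup looks like with the Hugging Face transformers and peft libraries: the base model is loaded in 4-bit NF4 quantization and small low-rank adapters are trained on top of the frozen weights. The rank, target modules, and dropout values here are illustrative assumptions, not the paper's reported configuration.

```python
# Minimal QLoRA setup sketch using Hugging Face transformers + peft.
# Rank, target modules, and dtype choices are assumptions for
# illustration, not the hyperparameters reported in the paper.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-3.2-3B-Instruct"

# Load the frozen base model in 4-bit NF4 quantization (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach small low-rank adapters to the attention projections; only
# these adapter matrices are trained, so the fine-tune is cheap and
# the result ships as lightweight adapter weights.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of parameters
```

Training then proceeds as ordinary causal-language-model fine-tuning on the instruction-formatted corpus, and only the adapter weights are saved, which is consistent with the paper's release of adapters rather than full model checkpoints.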

💡 Why This Paper Matters

This paper introduces a framework to measure and improve large language models' (LLMs') dispositions towards biological systems, underscoring the significance of bioalignment for AI safety. By demonstrating that targeted fine-tuning can alter model dispositions, the research both deepens our understanding of LLM behavior and underscores the need for responsible AI development. The practical implication is that AI systems can be deliberately engineered to favor sustainable, biologically inspired solutions, offering a safeguard against harmful recommendations in various applications.

🎯 Why It's Interesting for AI Security Researchers

The findings of this paper are particularly relevant to AI security researchers as they address the critical issue of biases in LLMs, which can influence decision-making and recommendations in unforeseen ways. By proposing a method for bias correction towards biologically aligned solutions, this research opens up discussions on how AI systems can be made safer and more aligned with ecological and ethical standards, potentially preventing undesirable or harmful outcomes.

📚 Read the Full Paper: https://arxiv.org/abs/2603.09154v1