Automated Analysis of Global AI Safety Initiatives: A Taxonomy-Driven LLM Approach

Authors: Takayuki Semitsu, Naoto Kiribuchi, Kengo Zenitani

Published: 2026-04-04

arXiv ID: 2604.03533v1

Added to Library: 2026-04-07 02:03 UTC

Safety

📄 Abstract

We present an automated crosswalk framework that compares pairs of AI safety policy documents under a shared taxonomy of activities. Using the activity categories defined in the Activity Map on AI Safety as fixed aspects, the system extracts and maps relevant activities, then produces, for each aspect, a short summary of each document, a brief comparison, and a similarity score. We assess the stability and validity of LLM-based crosswalk analysis across public policy documents. Using five large language models, we perform crosswalks on ten publicly available documents and visualize mean similarity scores as a heatmap. The results show that model choice substantially affects crosswalk outcomes, and that some document pairs yield high disagreement across models. A human evaluation by three experts on two document pairs shows high inter-annotator agreement, while model scores still differ from human judgments. These findings support using the framework for comparative inspection of policy documents, while underscoring the need for human oversight.
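The aggregation step the abstract describes, averaging per-aspect similarity scores from several models into a heatmap cell and flagging aspects where models disagree, can be sketched as follows. This is a minimal illustration, not the authors' implementation; the aspect names, model names, scores, and the disagreement threshold are all hypothetical.

```python
# Hypothetical sketch of the score-aggregation step for one document pair:
# per-aspect similarity scores (one per model) are averaged into a single
# heatmap cell, and the per-aspect spread flags inter-model disagreement.
# All names and values below are illustrative, not taken from the paper.
from statistics import mean, pstdev

# scores[aspect][model] -> similarity score in [0, 1] for one document pair
scores = {
    "risk assessment":    {"model_a": 0.8, "model_b": 0.7, "model_c": 0.9},
    "incident reporting": {"model_a": 0.3, "model_b": 0.9, "model_c": 0.5},
}

def aggregate(scores):
    """Return per-aspect mean scores, per-aspect spread, and the pair mean."""
    per_aspect_mean = {a: mean(by_model.values()) for a, by_model in scores.items()}
    per_aspect_spread = {a: pstdev(list(by_model.values())) for a, by_model in scores.items()}
    pair_mean = mean(per_aspect_mean.values())  # one cell of the heatmap
    return per_aspect_mean, per_aspect_spread, pair_mean

per_aspect, spread, cell = aggregate(scores)
# Aspects whose spread exceeds a chosen threshold indicate high
# inter-model disagreement and merit human review.
flagged = [a for a, s in spread.items() if s > 0.2]
```

In this toy input, "incident reporting" would be flagged: its three model scores (0.3, 0.9, 0.5) spread far more widely than those for "risk assessment", mirroring the paper's observation that some aspects show high inter-model disagreement.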

🔍 Key Points

  • Proposes an automated crosswalk framework for comparing AI safety policy documents using a shared taxonomy (AMAIS) to enhance interoperability among different policy frameworks.
  • Implements a detailed LLM-based analysis methodology that produces per-aspect summaries, comparisons, and similarity scores for AI safety activities across multiple documents.
  • Findings indicate significant variability in similarity scores depending on the language model used, with specific aspects showing high inter-model disagreement, suggesting the need for careful LLM selection or ensemble approaches in automated evaluations.
  • Validation through human evaluation revealed a calibration gap between LLM-generated scores and expert assessments, emphasizing the importance of human oversight in automated policy analysis.
  • Identifies limitations in current methodologies, including challenges in document extraction and mapping, language settings, and the ambiguity of activity categories, pointing to areas for future research.

💡 Why This Paper Matters

The proposed framework for automated crosswalk analysis represents a significant step toward standardizing comparisons of AI safety policies across organizations and jurisdictions. By grounding the analysis in a shared taxonomy and employing LLMs, it enhances the reproducibility and transparency of policy assessments. The study's results also highlight the importance of model selection in automated analysis and underscore the need for continued integration of human expertise in AI safety evaluations.

🎯 Why It's Interesting for AI Security Researchers

This paper is directly relevant to AI security researchers because it addresses the pressing need for systematic comparisons of heterogeneous AI safety policies amid a rapidly evolving regulatory landscape. The methodology provides practical tools for analyzing alignment and discrepancies across policy approaches, supporting a deeper understanding of global AI governance efforts. Additionally, the findings on model variability and inter-annotator agreement are vital for improving the reliability of automated policy evaluations, an area of significant interest in the field.

📚 Read the Full Paper