
Learning Bug Context for PyTorch-to-JAX Translation with LLMs

Authors: Hung Phan, Son Le Vu, Ali Jannesari

Published: 2025-10-10

arXiv ID: 2510.09898v1

Added to Library: 2025-11-14 23:12 UTC

📄 Abstract

Despite recent progress of large language models (LLMs) on code translation among mainstream languages, translating PyTorch to JAX remains nontrivial. The two libraries, though both embedded in Python, differ in core design, execution semantics, and ecosystem maturity; JAX is newer and comparatively underrepresented in public code, and parallel PyTorch-JAX corpora are limited. Weaknesses in existing evaluation further complicate cross-framework benchmarking. We present T2J, a prompt-augmentation framework that strengthens LLM-based PyTorch-to-JAX translation. Our pipeline (i) assembles two PyTorch sources, the problem-solving set from TorchLeet (Aroori & Chien, 2025) and a GitHub-derived set from CodeParrot (Wolf et al., 2022), and uses GPT-4o-mini to produce initial JAX drafts; (ii) engages two professional developers to iteratively repair those drafts until functional equivalence, yielding a curated fixed-bug dataset of common errors and patches; and (iii) constructs augmented prompts that inject structured guidance from these fixes to steer lightweight LLMs (e.g., GPT-4o-mini). We also introduce three metrics tailored to PyTorch-to-JAX translation: T2J CodeTrans Score, T2J FixCost Score (an LLM-based estimate of bug-fix effort), and T2J Comparison Score (LLM-as-judge). Empirically, T2J raises GPT-4o-mini performance by up to 10% on CodeBLEU, 50% on T2J FixCost Score, 1.33 points on T2J CodeTrans Score (0-4 scale), and 100% on T2J Comparison Score; moreover, the generated code runs up to 2.5x faster than the baseline.
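
To make the framework gap concrete, here is a minimal illustrative contrast (a sketch of my own, not taken from the paper): PyTorch idiomatically mutates tensors in place, while JAX arrays are immutable and updates must be expressed functionally, which is one of the semantic differences a translator has to bridge.

```python
# Illustrative only: in-place mutation (PyTorch) vs. functional updates (JAX).
import torch
import jax.numpy as jnp

# PyTorch: in-place assignment mutates the tensor directly.
t = torch.zeros(3)
t[0] = 1.0

# JAX: arrays are immutable; an "update" returns a new array.
x = jnp.zeros(3)
x = x.at[0].set(1.0)
```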

🔍 Key Points

  • The paper targets PyTorch-to-JAX translation, a setting where LLMs still struggle: the two libraries differ in core design and execution semantics, JAX is comparatively underrepresented in public code, and parallel PyTorch-JAX corpora are scarce.
  • It introduces T2J, a prompt-augmentation framework: GPT-4o-mini produces initial JAX drafts from two PyTorch sources (the TorchLeet problem-solving set and a GitHub-derived CodeParrot set), two professional developers iteratively repair the drafts until functional equivalence, and the resulting fixed-bug dataset supplies structured guidance that is injected into prompts for lightweight LLMs (a minimal prompt sketch follows this list).
  • Three metrics tailored to PyTorch-to-JAX translation are proposed: T2J CodeTrans Score, T2J FixCost Score (an LLM-based estimate of bug-fix effort), and T2J Comparison Score (LLM-as-judge).
  • Empirically, T2J improves GPT-4o-mini by up to 10% on CodeBLEU, 50% on T2J FixCost Score, 1.33 points on T2J CodeTrans Score (0-4 scale), and 100% on T2J Comparison Score, and the generated code runs up to 2.5x faster than the baseline.
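
The abstract says T2J injects structured guidance distilled from the fixed-bug dataset into the translation prompt; the paper's exact template is not reproduced here, so the sketch below is a hypothetical illustration (the function name build_augmented_prompt, the template wording, and the guideline strings are assumptions, not the authors' artifacts).

```python
# Hypothetical sketch of prompt augmentation with fix-derived guidance.
# Names, template wording, and guideline entries are illustrative assumptions;
# the paper's actual prompt format may differ.

def build_augmented_prompt(pytorch_code: str, fix_guidelines: list[str]) -> str:
    """Compose a translation prompt that carries known bug-fix guidance."""
    rules = "\n".join(f"- {g}" for g in fix_guidelines)
    return (
        "Translate the following PyTorch code into functionally equivalent JAX.\n"
        "Apply these known PyTorch-to-JAX pitfalls and their fixes:\n"
        f"{rules}\n\n"
        "PyTorch code:\n"
        f"{pytorch_code}\n"
    )

# Example guidance entries of the kind a curated fixed-bug dataset might yield.
guidelines = [
    "Replace in-place tensor mutation with array.at[...].set(...).",
    "Pass PRNG keys explicitly (jax.random.PRNGKey / jax.random.split) "
    "instead of relying on a global seed.",
]
prompt = build_augmented_prompt("x = torch.zeros(3)\nx[0] = 1.0", guidelines)
print(prompt)
```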

💡 Why This Paper Matters

This paper matters because it shows that a lightweight LLM can be substantially improved on a low-resource translation pair (PyTorch to JAX) without retraining: a curated dataset of developer-repaired bugs, distilled into structured prompt guidance, compensates for the scarcity of parallel PyTorch-JAX corpora. The accompanying metrics (T2J CodeTrans, FixCost, and LLM-as-judge Comparison scores) also address weaknesses in existing cross-framework evaluation, giving a more practical way to benchmark framework-to-framework translation.

🎯 Why It's Interesting for AI Security Researchers

For researchers concerned with the reliability and trustworthiness of LLM-generated code, the paper's curated fixed-bug dataset catalogs the kinds of errors a model introduces when translating between frameworks, and the T2J FixCost and LLM-as-judge metrics offer ways to quantify how much defective or non-equivalent code a model emits. These artifacts are useful for assessing where automated translation can silently change program behavior and for evaluating mitigations before such tooling is deployed.

📚 Read the Full Paper