RLIE: Rule Generation with Logistic Regression, Iterative Refinement, and Evaluation for Large Language Models

Authors: Yang Yang, Hua XU, Zhangyi Hu, Yutao Yue

Published: 2025-10-22

arXiv ID: 2510.19698v1

Added to Library: 2025-11-14 23:08 UTC

📄 Abstract

Large Language Models (LLMs) can propose rules in natural language, sidestepping the need for a predefined predicate space in traditional rule learning. Yet many LLM-based approaches ignore interactions among rules, and the opportunity to couple LLMs with probabilistic rule learning for robust inference remains underexplored. We present RLIE, a unified framework that integrates LLMs with probabilistic modeling to learn a set of weighted rules. RLIE has four stages: (1) Rule generation, where an LLM proposes and filters candidates; (2) Logistic regression, which learns probabilistic weights for global selection and calibration; (3) Iterative refinement, which updates the rule set using prediction errors; and (4) Evaluation, which compares the weighted rule set as a direct classifier with methods that inject rules into an LLM. We evaluate multiple inference strategies on real-world datasets. Applying rules directly with their learned weights yields superior performance, whereas prompting LLMs with the rules, weights, and logistic-model outputs surprisingly degrades accuracy. This supports the view that LLMs excel at semantic generation and interpretation but are less reliable for precise probabilistic integration. RLIE clarifies the potential and limitations of LLMs for inductive reasoning and couples them with classic probabilistic rule combination methods to enable more reliable neuro-symbolic reasoning.
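The core idea described in the abstract is to treat each LLM-proposed rule as a binary feature and learn a per-rule weight with logistic regression, then use the weighted rule set directly as a classifier. The sketch below is a minimal illustration of that idea, not the authors' code; the rule predicates, training texts, and labels are hypothetical placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical rules: in RLIE these would be natural-language rules proposed
# and filtered by an LLM; here they are stand-in boolean predicates over text.
rules = [
    ("mentions a refund", lambda t: "refund" in t.lower()),
    ("mentions a price", lambda t: "$" in t),
    ("long message", lambda t: len(t) > 80),
]

def rule_activations(texts):
    """Binary feature matrix with one column per rule (1 = rule fires)."""
    return np.array([[int(fn(t)) for _, fn in rules] for t in texts])

# Toy labeled data (placeholders, not from the paper).
train_texts = [
    "Please refund my $20 order from last week.",
    "Great product, thanks!",
    "Requesting a refund for item 42.",
    "Short note.",
]
train_labels = np.array([1, 0, 1, 0])

# Stage 2: logistic regression learns one weight per rule plus a bias,
# which provides global rule selection and probability calibration.
clf = LogisticRegression().fit(rule_activations(train_texts), train_labels)

# Direct inference with the weighted rule set, without routing the rules
# back through an LLM (the strategy the paper reports as most accurate).
print(clf.predict_proba(rule_activations(["I want a refund for this $5 item"]))[:, 1])
```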

🔍 Key Points

  • Introduces RLIE, a unified framework that couples LLM-generated natural-language rules with probabilistic modeling to learn a globally weighted rule set, without requiring a predefined predicate space.
  • Four-stage pipeline: (1) rule generation, where an LLM proposes and filters candidate rules; (2) logistic regression, which learns per-rule weights for global selection and calibration; (3) iterative refinement, which updates the rule set using prediction errors (see the sketch after this list); and (4) evaluation of the weighted rules as a direct classifier versus injecting them into an LLM.
  • On real-world datasets, applying the rules directly with their learned weights yields the best performance, whereas prompting an LLM with the rules, weights, and logistic-model outputs surprisingly degrades accuracy.
  • The results indicate that LLMs excel at semantic generation and interpretation but are less reliable for precise probabilistic integration.
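The iterative-refinement stage can be pictured as the loop below: refit the rule weights, collect the examples the current rule set misclassifies, and ask the LLM for new candidate rules targeted at those failures. This is a minimal sketch under assumptions, not the paper's implementation; `propose_rules_from_errors` is a hypothetical stand-in for the LLM call, and the stopping criterion is illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def propose_rules_from_errors(error_texts):
    # Hypothetical placeholder: in RLIE this step would prompt an LLM with the
    # misclassified examples and parse fresh candidate rules from its reply.
    return []

def refine(rules, texts, labels, rounds=3):
    """Iteratively refit rule weights and grow the rule set from prediction errors.

    `rules` is a list of (name, predicate) pairs, as in the earlier sketch.
    """
    labels = np.asarray(labels)
    clf = None
    for _ in range(rounds):
        X = np.array([[int(fn(t)) for _, fn in rules] for t in texts])
        clf = LogisticRegression().fit(X, labels)
        errors = [t for t, y, p in zip(texts, labels, clf.predict(X)) if y != p]
        if not errors:  # illustrative stopping rule: no residual training errors
            break
        rules = rules + propose_rules_from_errors(errors)
    return rules, clf
```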

💡 Why This Paper Matters

RLIE couples the open-ended rule-generation abilities of LLMs with classic probabilistic rule-combination methods, sidestepping the predefined predicate spaces of traditional rule learning while keeping inference calibrated and interpretable. Its central empirical finding, that a weighted rule set used directly as a classifier outperforms feeding those same rules and weights back to an LLM, clarifies both the potential and the limitations of LLMs for inductive reasoning and points toward more reliable neuro-symbolic systems.

🎯 Why It's Interesting for AI Security Researchers

For researchers concerned with the reliability and auditability of LLM-based systems, RLIE offers two useful takeaways. First, its weighted rule sets provide an interpretable, calibrated decision path that can be inspected and verified, a valuable property when model behavior must be audited. Second, the finding that accuracy degrades when an LLM is asked to integrate explicit rules, weights, and model outputs is a caution for any pipeline that delegates precise probabilistic or risk-weighted judgments to an LLM rather than to a dedicated probabilistic model.

📚 Read the Full Paper