
deepSURF: Detecting Memory Safety Vulnerabilities in Rust Through Fuzzing LLM-Augmented Harnesses

Authors: Georgios Androutsopoulos, Antonio Bianchi

Published: 2025-06-18

arXiv ID: 2506.15648v1

Added to Library: 2025-06-19 03:00 UTC

Tags: Safety

📄 Abstract

Although Rust ensures memory safety by default, it also permits the use of unsafe code, which can introduce memory safety vulnerabilities if misused. Unfortunately, existing tools for detecting memory bugs in Rust typically exhibit limited detection capabilities, inadequately handle Rust-specific types, or rely heavily on manual intervention. To address these limitations, we present deepSURF, a tool that integrates static analysis with Large Language Model (LLM)-guided fuzzing harness generation to effectively identify memory safety vulnerabilities in Rust libraries, specifically targeting unsafe code. deepSURF introduces a novel approach for handling generics by substituting them with custom types and generating tailored implementations for the required traits, enabling the fuzzer to simulate user-defined behaviors within the fuzzed library. Additionally, deepSURF employs LLMs to augment fuzzing harnesses dynamically, facilitating exploration of complex API interactions and significantly increasing the likelihood of exposing memory safety vulnerabilities. We evaluated deepSURF on 27 real-world Rust crates, successfully rediscovering 20 known memory safety bugs and uncovering 6 previously unknown vulnerabilities, demonstrating clear improvements over state-of-the-art tools.
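To make the harness-augmentation idea concrete, here is a minimal, hypothetical Rust sketch (not code generated by deepSURF): a bare seed harness next to an "augmented" one in which fuzzer-provided bytes select a sequence of API calls, the kind of complex, stateful interaction the paper targets. `VecDeque` merely stands in for a library API under test, and both function names are ours; in a cargo-fuzz setup, either body would sit inside a `fuzz_target!` macro.

```rust
use std::collections::VecDeque;

// Minimal seed harness: construct the target type and exercise one entry point.
fn seed_harness(data: &[u8]) {
    let mut q: VecDeque<u8> = VecDeque::new();
    q.extend(data.iter().copied());
}

// "Augmented" harness: fuzzer bytes choose a sequence of API calls, so the
// fuzzer can explore stateful interactions between methods rather than a
// single call. The structure is illustrative, not deepSURF's actual output.
fn augmented_harness(data: &[u8]) {
    let mut q: VecDeque<u8> = VecDeque::new();
    let mut bytes = data.iter().copied();
    while let Some(op) = bytes.next() {
        match op % 4 {
            0 => q.push_back(bytes.next().unwrap_or(0)),
            1 => q.push_front(bytes.next().unwrap_or(0)),
            2 => { q.pop_front(); }
            3 => q.reserve(usize::from(bytes.next().unwrap_or(0))),
            _ => unreachable!(),
        }
    }
}
```

The jump from a single call to a fuzzer-chosen call sequence is what lets coverage reach code paths that only trigger after particular method interleavings.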

🔍 Key Points

  • Introduction of deepSURF, a novel tool that integrates static analysis with Large Language Model (LLM)-guided fuzzing harness generation to detect memory safety vulnerabilities in Rust, specifically targeting unsafe code.
  • Novel handling of Rust generics by substituting type parameters with custom types and generating tailored implementations of the required traits, allowing the fuzzer to simulate user-defined behaviors and interact with complex types (see the sketch after this list).
  • LLM augmentation dynamically extends fuzzing harnesses to explore complex API interactions, significantly increasing the likelihood of uncovering memory safety vulnerabilities (illustrated by the harness sketch after the abstract above).
  • Empirical evaluation on 27 real-world Rust crates, rediscovering 20 known memory safety bugs and uncovering 6 previously unknown vulnerabilities, outperforming state-of-the-art tools.
  • deepSURF addresses several key challenges in fuzzing Rust libraries: targeting unsafe APIs, supporting complex and generic types, and generating meaningful API call sequences for testing.
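As a concrete picture of the generics-substitution point above, here is a minimal, hypothetical Rust sketch (all names are ours, not deepSURF's generated code): a custom type stands in for a generic `T: Ord` parameter, and a fuzzer-controlled flag makes its `Ord` implementation violate the total-order axioms, modeling the buggy user-defined trait implementations that unsafe library code sometimes implicitly trusts.

```rust
use std::cmp::Ordering;

// A stand-in for a generic `T: Ord` whose comparison behavior is driven by
// fuzzer-provided bytes rather than a lawful ordering.
#[derive(Clone, Debug)]
struct FuzzItem {
    key: u8,
    // Fuzzer-chosen flag: when set, `cmp` returns inconsistent results,
    // modeling a buggy user-defined `Ord` implementation.
    chaotic: bool,
}

impl Ord for FuzzItem {
    fn cmp(&self, other: &Self) -> Ordering {
        if self.chaotic {
            // Deliberately violates the total-order axioms; library code
            // that assumes a lawful `Ord` inside `unsafe` blocks may break.
            Ordering::Less
        } else {
            self.key.cmp(&other.key)
        }
    }
}

impl PartialOrd for FuzzItem {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for FuzzItem {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for FuzzItem {}

// Harness body in the style of cargo-fuzz: decode items from raw bytes, then
// drive an API that is generic over `T: Ord`.
fn harness(data: &[u8]) {
    let mut items: Vec<FuzzItem> = data
        .chunks(2)
        .map(|c| FuzzItem {
            key: c[0],
            chaotic: c.get(1).map_or(false, |b| b & 1 == 1),
        })
        .collect();

    // Exercise an API that trusts `Ord` internally. std's sort stays memory
    // safe even with an inconsistent comparator, but a library's unsafe
    // sorting or ordering code may not.
    items.sort();
}
```

The underlying design point is that unsafe code is only sound if it holds up under every trait implementation a user could legally write, so letting the fuzzer synthesize adversarial implementations directly probes those assumptions.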

💡 Why This Paper Matters

The paper presents deepSURF, an approach to detecting memory safety vulnerabilities in Rust libraries that demonstrates how static analysis combined with LLM-guided fuzzing can substantially improve the discovery of hidden bugs. This matters because Rust's memory safety guarantees do not extend to misused unsafe code; deepSURF makes the ecosystem more resilient against the vulnerabilities such code can introduce.

🎯 Why It's Interesting for AI Security Researchers

This paper is of particular interest to AI security researchers because it applies large language models to a concrete software security problem: finding memory safety bugs in a language prized for its safety guarantees. Its methodology and findings could inspire further work on AI-assisted software testing, vulnerability detection, and the integration of LLMs into security-oriented tooling, broadening AI's role in enhancing software security.
