Research Highlights of PurCL

Under the DARPA VSPELLS program, we have developed a suite of analysis and testing techniques for code lifting and understanding across diverse program domains. The most notable achievements are listed below.

Selected Achievements

• A formal-methods-based program lifting technique that abstracts input formats using symbolic finite automata, enabling downstream tasks such as fuzzing (StateLifter, NetLifter) and differential symbolic analysis (ParDiff). ParDiff received an OOPSLA 2024 Distinguished Paper Award.
• Lifting network protocol specifications from documentation using large language models, supporting parser validation (ParCleanse) and functional bug detection (RFCScan).
• Decompiler augmentation using large language models for variable and type recovery (GymNM and ReSym), enhancing malware reverse engineering and binary code summarization. ReSym received a CCS 2024 Distinguished Paper Award.
• Autonomous LLM agents for repository-level code understanding and auditing (LLMDFA, LLMSAN, and RepoAudit), enabling build-free, customizable bug detection during development. The agents have uncovered over 300 previously unknown vulnerabilities in open-source projects (Bug Gallery), drawing attention from industry leaders such as GitHub CodeQL.