Kavli Affiliate: Long Zhang
| First 5 Authors: Zehan Chen, Zehan Chen, , ,
| Summary:
Automatically generating formal specifications from program code can greatly
enhance the efficiency of program verification and enable end-to-end automation
from requirements to reliable software. However, existing LLM-based approaches
often struggle with programs that include complex loop structures, leading to
irrelevant specifications. Moreover, the rigorous proof obligations and design
constraints imposed by verification tools can further result in incomplete and
ambiguous specifications. To address these challenges, we propose SLD-Spec, an
LLM-assisted specification generation method tailored for programs with complex
loop constructs. SLD-Spec introduces two novel phases into the traditional
specification generation framework: (1) A slicing phase, which decomposes each
function into code fragments containing independent loop structures, thereby
reducing the complexity of specification generation; and (2) A logical deletion
phase, which applies LLM-based reasoning to filter out incorrect candidate
specifications–especially those not easily identified by verification
tool–while retaining valid ones. Experimental results show that on the simple
dataset, SLD-Spec successfully verifies five more programs than the
state-of-the-art AutoSpec and reduces runtime by 23.73%. To address the
limitations of existing research, we manually construct a dataset comprising
four categories of complex loop programs. On this dataset, SLD-Spec
significantly improves the correctness, relevance, and completeness of
generated specifications compared to baseline methods, enabling 95.1% of
assertions and 90.91% of programs to pass verification. Ablation studies
further reveal that logical deletion is critical for enhancing specification
correctness and relevance, while program slicing contributes significantly to
specification completeness. Our code and data are publicly available.
| Search Query: ArXiv Query: search_query=au:”Long Zhang”&id_list=&start=0&max_results=3