Kavli Affiliate: Long Zhang | First 5 Authors: Jianyu Zhang, Yongwang Zhao, Long Zhang, Jilin Hu, Xiaokun Luan | Summary: Large language models (LLMs) for formal theorem proving have become a prominent research focus. At present, the proving ability of these LLMs is mainly evaluated through proof pass rates on datasets such as miniF2F. However, […]
Title: Psychometric-Based Evaluation for Theorem Proving with Large Language Models