Kavli Affiliate: Wei Gao | Summary: Chain-of-Thought (CoT) prompting has improved LLM reasoning, but models often generate explanations that appear coherent while containing unfaithful intermediate steps. Existing self-evaluation approaches are prone to inherent biases: the model may confidently endorse coherence even when the step-to-step implication is not valid, leading to unreliable faithfulness evaluation. We propose FACT-E, a […]
FACT-E: Causality-Inspired Evaluation for Trustworthy Chain-of-Thought Reasoning