NVR: Vector Runahead on NPUs for Sparse Memory Access

Kavli Affiliate: Jing Wang | First 5 Authors: Hui Wang, Zhengpeng Zhao, Jing Wang, Yushu Du, Yuan Cheng | Summary: Deep Neural Networks are increasingly leveraging sparsity to reduce the scaling up of model parameter size. However, reducing wall-clock time through sparsity and pruning remains challenging due to irregular memory access patterns, leading to frequent […]


EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning

Kavli Affiliate: Ke Wang | First 5 Authors: Xiaoqian Liu, Ke Wang, Yongbin Li, Yuchuan Wu, Wentao Ma | Summary: Large Language Models (LLMs) have shown impressive reasoning capabilities in well-defined problems with clear solutions, such as mathematics and coding. However, they still struggle with complex real-world scenarios like business negotiations, which require strategic reasoning-an […]


EquiBench: Benchmarking Code Reasoning Capabilities of Large Language Models via Equivalence Checking

Kavli Affiliate: Ke Wang | First 5 Authors: Anjiang Wei, Jiannan Cao, Ran Li, Hongyu Chen, Yuhui Zhang | Summary: Equivalence checking, i.e., determining whether two programs produce identical outputs for all possible inputs, underpins a broad range of applications, including software refactoring, testing, and optimization. We present the task of equivalence checking as a […]


Dominant Role of Coplanar Inflows in Driving Disk Evolution Revealed by Gas-Phase Metallicity Gradients

Kavli Affiliate: Yingjie Peng | First 5 Authors: Cheqiu Lyu, Enci Wang, Hongxin Zhang, Yingjie Peng, Xin Wang | Summary: Using spatially resolved spectroscopic data from the MaNGA sample, we investigate the parameters influencing the radial gradients of gas-phase metallicity ($\nabla\log(\mathrm{O/H})$), to determine whether disk formation is primarily driven by coplanar gas inflow or by […]


CSST Large Scale Structure Analysis Pipeline: III. Emission-line Redshift Measurement for Slitless Spectra

Kavli Affiliate: Hu Zhan | First 5 Authors: Jipeng Sui, Hu Zou, Xiaohu Yang, Xianzhong Zheng, Run Wen | Summary: The China Space Station Telescope (CSST) is a forthcoming space-based optical telescope designed to co-orbit with the Chinese Space Station. With a planned slitless spectroscopic survey spanning a broad wavelength range of $255$-$1000$ nm and an […]


Simplify RLHF as Reward-Weighted SFT: A Variational Method

Kavli Affiliate: Zhuo Li | First 5 Authors: Yuhao Du, Zhuo Li, Pengyu Cheng, Zhihong Chen, Yuejiao Xie | Summary: Reinforcement Learning from Human Feedback (RLHF) is crucial for aligning Large Language Models (LLMs) with human values. However, RLHF has been continuously challenged by its high complexity in implementation and computation consumption. Even with recent […]

