Kavli Affiliate: Zhuo Li | First 5 Authors: Yuhao Du, Zhuo Li, Pengyu Cheng, Zhihong Chen, Yuejiao Xie | Summary: Reinforcement Learning from Human Feedback (RLHF) is crucial for aligning Large Language Models (LLMs) with human values. However, RLHF has been persistently challenged by its high implementation complexity and computational cost. Even with recent […]
Continue reading: Simplify RLHF as Reward-Weighted SFT: A Variational Method
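
The summary above is truncated, so the sketch below is not the paper's exact variational derivation; it only illustrates the general idea the title points to: recasting the RLHF update as supervised fine-tuning whose per-sample loss is scaled by a reward-derived weight. The function name `reward_weighted_sft_loss`, the softmax-over-rewards weighting, and the temperature `beta` are all assumptions for illustration.

```python
# Hypothetical sketch of a reward-weighted SFT loss (not the paper's formulation).
import torch
import torch.nn.functional as F


def reward_weighted_sft_loss(logits: torch.Tensor,
                             target_ids: torch.Tensor,
                             rewards: torch.Tensor,
                             beta: float = 1.0) -> torch.Tensor:
    """SFT cross-entropy loss where each sequence is weighted by its reward.

    logits:     (batch, seq_len, vocab) model outputs for the response tokens
    target_ids: (batch, seq_len) sampled response token ids
    rewards:    (batch,) scalar reward for each response
    beta:       temperature controlling how sharply high-reward samples dominate
    """
    # Per-token negative log-likelihood, averaged over the sequence -> (batch,)
    nll = F.cross_entropy(
        logits.transpose(1, 2),  # (batch, vocab, seq_len), as cross_entropy expects
        target_ids,
        reduction="none",
    ).mean(dim=1)

    # Softmax-normalized exponential reward weights (an assumed weighting choice).
    weights = torch.softmax(rewards / beta, dim=0)

    # Weighted SFT objective: high-reward responses contribute more to the gradient.
    return (weights * nll).sum()


if __name__ == "__main__":
    batch, seq_len, vocab = 4, 8, 32
    logits = torch.randn(batch, seq_len, vocab, requires_grad=True)
    target_ids = torch.randint(0, vocab, (batch, seq_len))
    rewards = torch.tensor([0.1, 0.9, -0.3, 0.5])
    loss = reward_weighted_sft_loss(logits, target_ids, rewards, beta=0.5)
    loss.backward()
    print(f"reward-weighted SFT loss: {loss.item():.4f}")
```

Compared with a full RLHF pipeline, an objective of this shape needs no separate value model or on-policy rollout machinery, which is the kind of simplification the title advertises.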