Kavli Affiliate: Ke Wang | First 5 Authors: Ke Wang, Nikolaos Dimitriadis, Alessandro Favero, Guillermo Ortiz-Jimenez, Francois Fleuret | Summary: Fine-tuning pre-trained models has become the standard approach to endow them with specialized knowledge, but it poses fundamental challenges. In particular, (i) fine-tuning often leads to catastrophic forgetting, where improvements on a target domain degrade […]
Continue reading: LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging
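The title names the core idea: rescaling a fine-tuned model's weight updates layer by layer after training, so that shallow layers stay close to the pre-trained (general-purpose) weights while deeper layers retain more of the task-specific change. The sketch below is only an illustration of that idea under stated assumptions; the function name `lines_rescale`, the linear depth schedule `alpha + beta * depth_fraction`, and the treatment of each parameter tensor as one depth step are all assumptions, not the paper's exact procedure.

```python
import torch


def lines_rescale(pretrained_state, finetuned_state, alpha=0.0, beta=1.0):
    """Hedged sketch of post-training layer scaling.

    Assumption: the fine-tuning update (task vector) for each parameter is
    scaled by a factor that grows linearly with depth, then added back to the
    pre-trained weights. Shallow layers therefore keep mostly pre-trained
    features, which is intended to limit forgetting.
    """
    keys = list(pretrained_state.keys())
    num_steps = max(len(keys) - 1, 1)
    merged = {}
    for idx, key in enumerate(keys):
        # Task vector: difference between fine-tuned and pre-trained weights.
        delta = finetuned_state[key] - pretrained_state[key]
        # Illustrative linearly increasing schedule over depth
        # (each parameter tensor is treated as one depth step for simplicity).
        scale = alpha + beta * (idx / num_steps)
        merged[key] = pretrained_state[key] + scale * delta
    return merged


# Usage sketch: rescale a fine-tuned checkpoint relative to its base model.
# merged_state = lines_rescale(base_model.state_dict(), finetuned_model.state_dict())
# base_model.load_state_dict(merged_state)
```

The same rescaled task vectors could, in principle, be summed across several fine-tuned checkpoints before adding them to the pre-trained weights, which is how a post-training scaling rule would interact with model merging; the details of that combination are not specified in this summary.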