Kavli Affiliate: Max Tegmark | First 5 Authors: Matthew Chen, Joshua Engels, Max Tegmark | Summary: Sparse autoencoders (SAEs) decompose language model representations into a sparse set of linear latent vectors. Recent works have improved SAEs using language model gradients, but these techniques require many expensive backward passes during training and still cause a […]
Continue reading: Low-Rank Adapting Models for Sparse Autoencoders