Kavli Affiliate: Max Tegmark | First 5 Authors: Yuxiao Li, Eric J. Michaud, David D. Baek, Joshua Engels, Xiaoqing Sun | Summary: Sparse autoencoders have recently produced dictionaries of high-dimensional vectors corresponding to the universe of concepts represented by large language models. We find that this concept universe has interesting structure at three levels: 1) […]
Continue reading: The Geometry of Concepts: Sparse Autoencoder Feature Structure