Kavli Affiliate: Max Tegmark | First 5 Authors: Ziming Liu, Eric J. Michaud, Max Tegmark | Summary: Grokking, the unusual phenomenon for algorithmic datasets where generalization happens long after overfitting the training data, has remained elusive. We aim to understand grokking by analyzing the loss landscapes of neural networks, identifying the mismatch between training […]
Omnigrok: Grokking Beyond Algorithmic Data