Kavli Affiliate: Max Tegmark | First 5 Authors: Ziming Liu, Max Tegmark | Summary: Neural scaling laws (NSL) refer to the phenomenon where model performance improves with scale. Sharma & Kaplan analyzed NSL using approximation theory and predicted that MSE losses decay as $N^{-\alpha}$, $\alpha=4/d$, where $N$ is the number of model parameters, […]
A Neural Scaling Law from Lottery Ticket Ensembling
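
To make the quoted prediction concrete, here is a minimal sketch of what a loss that scales as $N^{-\alpha}$ with $\alpha=4/d$ implies when the parameter count $N$ is doubled. The function name `predicted_loss_ratio` and the sample values of $d$ are illustrative assumptions, not taken from the paper. The truncated summary does not define $d$; in Sharma & Kaplan's analysis it is, as far as I know, the intrinsic dimension of the data.

```python
def predicted_loss_ratio(scale_factor: float, d: float) -> float:
    """Return L(scale_factor * N) / L(N), assuming L is proportional to N**(-alpha).

    alpha = 4 / d is the exponent quoted in the summary; the proportionality
    constant cancels in the ratio, so it does not need to be known.
    """
    alpha = 4.0 / d
    return scale_factor ** (-alpha)


# Hypothetical values of d, chosen only to show the trend.
for d in (2, 4, 8):
    ratio = predicted_loss_ratio(2.0, d)
    print(f"d={d}: alpha={4.0 / d:.2f}, doubling N multiplies the loss by {ratio:.3f}")
```

Under this formula, a larger $d$ gives a smaller exponent, so each doubling of $N$ buys a smaller reduction in loss.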