Kavli Affiliate: Jing Wang | First 5 Authors: Haoze He, Jing Wang, Anna Choromanska | Summary: This work focuses on the decentralized deep learning optimization framework. We propose Adjacent Leader Decentralized Stochastic Gradient Descent (AL-DSGD) to improve final model performance, accelerate convergence, and reduce the communication overhead of decentralized deep learning optimizers. AL-DSGD relies on […]
Continue reading: Adjacent Leader Decentralized Stochastic Gradient Descent