Kavli Affiliate: Jia Liu | First 5 Authors: Ziyue Luo, Jia Liu, Myungjin Lee, Ness B. Shroff | Summary: The recent explosive growth of deep learning (DL) models has created a compelling need for efficient job scheduling for distributed deep learning training with mixed parallelisms (DDLwMP) in GPU clusters. This paper proposes an adaptive shortest-remaining-processing-time-first […]
Prediction-Assisted Online Distributed Deep Learning Workload Scheduling in GPU Clusters