AdaSelection: Accelerating Deep Learning Training through Data Subsampling

Kavli Affiliate: Jia Liu

| First 5 Authors: Minghe Zhang, Chaosheng Dong, Jinmiao Fu, Tianchen Zhou, Jia Liang

| Summary:

In this paper, we introduce AdaSelection, an adaptive sub-sampling method to
identify the most informative sub-samples within each minibatch to speed up the
training of large-scale deep learning models without sacrificing model
performance. Our method flexibly combines an arbitrary number of
baseline sub-sampling methods, incorporating both method-level importance and
intra-method sample-level importance at each iteration. The standard practice
of ad-hoc sampling often leads to continuous training with vast amounts of data
from production environments. To improve the selection of data instances during
forward and backward passes, we propose recording a constant amount of
information per instance from these passes. We demonstrate the effectiveness of
our method by testing it across various types of inputs and tasks, including
classification tasks on both image and language datasets, as well as
regression tasks. Compared with industry-standard baselines, AdaSelection
consistently displays superior performance.
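
To make the idea of combining method-level and sample-level importance concrete, the following is a minimal sketch in Python, not the paper's algorithm. It assumes non-negative per-sample losses from the forward pass, and every name in it (big_loss, small_loss, uniform, select_subsample, method_weights) is hypothetical. The summary does not say how method-level importance is updated at each iteration, so the method weights are simply passed in as fixed inputs here.

```python
from typing import Callable, Dict, List, Sequence

# A scorer maps the per-sample losses of one minibatch to per-sample
# importance scores that sum to 1 (sample-level importance within a method).
Scorer = Callable[[Sequence[float]], List[float]]


def normalize(scores: Sequence[float]) -> List[float]:
    """Scale non-negative scores to sum to 1; fall back to uniform if all zero."""
    total = sum(scores)
    n = len(scores)
    if total <= 0:
        return [1.0 / n] * n
    return [s / total for s in scores]


def big_loss(losses: Sequence[float]) -> List[float]:
    """Hypothetical baseline: favor samples with large loss."""
    return normalize(list(losses))


def small_loss(losses: Sequence[float]) -> List[float]:
    """Hypothetical baseline: favor samples with small loss."""
    top = max(losses)
    return normalize([top - loss for loss in losses])


def uniform(losses: Sequence[float]) -> List[float]:
    """Hypothetical baseline: every sample is equally important."""
    return [1.0 / len(losses)] * len(losses)


def select_subsample(
    losses: Sequence[float],
    scorers: Dict[str, Scorer],
    method_weights: Dict[str, float],
    k: int,
) -> List[int]:
    """Return the indices of the k samples with the highest combined score.

    Combined score of sample i = sum over methods of
    (normalized method weight) * (that method's score for sample i).
    """
    weight_total = sum(method_weights[name] for name in scorers)
    if weight_total <= 0:
        raise ValueError("method weights must sum to a positive value")

    combined = [0.0] * len(losses)
    for name, scorer in scorers.items():
        weight = method_weights[name] / weight_total
        for i, score in enumerate(scorer(losses)):
            combined[i] += weight * score

    ranked = sorted(range(len(losses)), key=lambda i: combined[i], reverse=True)
    return sorted(ranked[:k])


# Usage: keep 2 of 4 samples, weighting the large-loss method more heavily.
batch_losses = [0.1, 2.0, 0.5, 1.5]
kept = select_subsample(
    batch_losses,
    scorers={"big_loss": big_loss, "uniform": uniform},
    method_weights={"big_loss": 0.8, "uniform": 0.2},
    k=2,
)
```

In a training loop, only the samples at the returned indices would take part in the backward pass. An adaptive scheme would also adjust method_weights between iterations, which this sketch leaves out.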

| Search Query: ArXiv Query: search_query=au:"Jia Liu"&id_list=&start=0&max_results=10