Accelerated Proximal Alternating Gradient-Descent-Ascent for Nonconvex Minimax Machine Learning

Kavli Affiliate: Yi Zhou

| First 5 Authors: Ziyi Chen, Shaocong Ma, Yi Zhou

| Summary:

Alternating gradient-descent-ascent (AltGDA) is an optimization algorithm
widely used for model training in machine learning applications that involve
nonconvex minimax optimization. However, existing studies show that it suffers
from high computation complexity in nonconvex minimax optimization. In this
paper, we develop a
single-loop and fast AltGDA-type algorithm that leverages proximal gradient
updates and momentum acceleration to solve regularized nonconvex minimax
optimization problems. By leveraging the momentum acceleration technique, we
prove that the algorithm converges to a critical point in nonconvex minimax
optimization and achieves a computation complexity in the order of
$\mathcal{O}(\kappa^{\frac{11}{6}}\epsilon^{-2})$, where $\epsilon$ is the
desired level of accuracy and $\kappa$ is the problem's condition number. Such
a computation complexity improves upon the state-of-the-art complexities of
single-loop GDA and AltGDA algorithms (see the comparison summarized in
Table 1 of the paper). We demonstrate the effectiveness of our algorithm via an
experiment on adversarial deep learning.
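
The single-loop structure described above can be pictured as alternating a proximal, momentum-accelerated gradient step on the minimization variable with a gradient ascent step on the maximization variable. The Python sketch below illustrates this generic pattern on a toy regularized nonconvex-strongly-concave problem; the objective, step sizes, and momentum coefficient are illustrative assumptions and not the paper's exact algorithm or constants.

```python
import numpy as np

# Minimal sketch of a single-loop proximal AltGDA-type scheme with momentum.
# NOT the paper's exact algorithm: the toy objective, step sizes, and momentum
# coefficient below are assumed for illustration only.
# Problem form: min_x max_y  sum(sin(x)) + y^T A x - (mu/2)||y||^2 + lam*||x||_1

rng = np.random.default_rng(0)
d, m = 20, 10
A = rng.standard_normal((m, d)) / np.sqrt(d)
lam, mu = 0.01, 1.0                    # l1 weight, strong-concavity parameter
eta_x, eta_y, beta = 0.05, 0.1, 0.6    # step sizes and momentum (assumed)

def grad_x(x, y):
    # gradient of the smooth part w.r.t. x (nonconvex in x via the sine term)
    return np.cos(x) + A.T @ y

def grad_y(x, y):
    # gradient w.r.t. y; the -mu*y term makes the problem strongly concave in y
    return A @ x - mu * y

def prox_l1(v, t):
    # proximal operator of t * lam * ||.||_1 (soft-thresholding)
    return np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)

x = rng.standard_normal(d)
y = np.zeros(m)
x_mom = np.zeros(d)                    # momentum buffer for the x-update

for k in range(2000):
    # proximal gradient descent step on x with heavy-ball-style momentum
    x_mom = beta * x_mom - eta_x * grad_x(x, y)
    x = prox_l1(x + x_mom, eta_x)
    # gradient ascent step on y using the freshly updated x (alternating order)
    y = y + eta_y * grad_y(x, y)

print("||grad_x||:", np.linalg.norm(grad_x(x, y)))
print("||grad_y||:", np.linalg.norm(grad_y(x, y)))
```

The sketch only shows the single-loop alternating structure; whether such a scheme attains the stated $\mathcal{O}(\kappa^{\frac{11}{6}}\epsilon^{-2})$ complexity depends on the specific step-size and momentum schedule analyzed in the paper.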

| Search Query: ArXiv Query: search_query=au:"Yi Zhou"&id_list=&start=0&max_results=10