Large-Scale Non-convex Stochastic Constrained Distributionally Robust Optimization

Kavli Affiliate: Yi Zhou

| First 5 Authors: Qi Zhang, Yi Zhou, Ashley Prater-Bennette, Lixin Shen, Shaofeng Zou

| Summary:

Distributionally robust optimization (DRO) is a powerful framework for
training robust models against data distribution shifts. This paper focuses on
constrained DRO, which has an explicit characterization of the robustness
level. Existing studies on constrained DRO mostly focus on convex loss
functions and exclude the practical and challenging case of non-convex loss
functions, e.g., neural networks. This paper develops a stochastic algorithm and
its performance analysis for non-convex constrained DRO. The computational
complexity of our stochastic algorithm at each iteration is independent of the
overall dataset size, and thus is suitable for large-scale applications. We
focus on uncertainty sets defined by the general Cressie-Read family of
divergences, which includes the $\chi^2$-divergence as a special case. We prove
that our algorithm finds an $\epsilon$-stationary point with a computational
complexity of $\mathcal{O}(\epsilon^{-3k_*-5})$, where $k_*$ is the parameter of
the Cressie-Read divergence. Numerical results indicate that our method
outperforms existing methods. Our method also applies to the smoothed
conditional value at risk (CVaR) DRO.
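
For context, the constrained DRO objective and the Cressie-Read divergence family referenced above can be sketched with standard textbook definitions (the notation $\ell$, $P$, $Q$, and $\rho$ is illustrative and not taken verbatim from the paper):

$$ \min_{\theta} \; \sup_{Q \,:\, D_{f_k}(Q \,\|\, P) \le \rho} \; \mathbb{E}_{x \sim Q}\big[\ell(\theta; x)\big], \qquad f_k(t) = \frac{t^k - k t + k - 1}{k(k-1)}, $$

where $P$ is the training distribution, the radius $\rho$ gives the explicit robustness level mentioned above, and $k = 2$ recovers the $\chi^2$-divergence. In the standard Cressie-Read parameterization, $k_*$ is the conjugate exponent $k/(k-1)$ (an assumption here; the abstract only calls it the family's parameter). The CVaR DRO mentioned at the end admits the Rockafellar-Uryasev variational form $\mathrm{CVaR}_{\alpha}(\ell) = \inf_{\eta} \{ \eta + \tfrac{1}{\alpha} \mathbb{E}_P[(\ell - \eta)_+] \}$; "smoothed" presumably refers to replacing the non-smooth $(\cdot)_+$ with a differentiable surrogate.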

| Search Query: ArXiv Query: search_query=au:"Yi Zhou"&id_list=&start=0&max_results=3
