
Efficient Stochastic Gradient Descent for Learning with Distributionally Robust Optimization (1805.08728v2)

Published 22 May 2018 in stat.ML, cs.LG, and math.OC

Abstract: Distributionally robust optimization (DRO) problems are increasingly seen as a viable method to train machine learning models for improved generalization. These min-max formulations, however, are more difficult to solve than their standard empirical-risk counterparts. We therefore provide a new stochastic gradient descent algorithm to efficiently solve this DRO formulation. Our approach applies gradient descent to the outer minimization problem and estimates the gradient of the inner maximization via a sample average approximation. The latter uses a subset of the data in each iteration, progressively increasing the subset size to ensure convergence. Theoretical results include establishing the optimal manner of growing the support size to balance a fundamental tradeoff between stochastic error and computational effort. Empirical results demonstrate the significant benefits of our approach over previous work, and also illustrate how learning with DRO can improve generalization.
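The abstract's recipe — gradient descent on the outer minimization, with the inner maximization estimated by a sample average approximation over a batch whose size grows each iteration — can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: it assumes a KL-penalized inner maximization (which admits a closed-form softmax weighting of per-sample losses) in place of the paper's constrained formulation, and the toy data, function names, and hyperparameters are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: linear regression with a handful of outliers,
# a setting where DRO-style reweighting is often motivated.
n, d = 500, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)
y[:25] += 5.0  # corrupt a few labels

def per_sample_losses(w, Xb, yb):
    r = Xb @ w - yb
    return 0.5 * r ** 2

def dro_sgd(X, y, lam=1.0, lr=0.05, iters=200, n0=8, growth=1.02):
    """SGD for a KL-penalized DRO objective (illustrative stand-in
    for the paper's formulation). Each iteration draws a batch whose
    size grows geometrically -- the 'progressively increasing subset
    size' of the abstract -- solves the inner max in closed form as
    tilted sample weights p_i ∝ exp(loss_i / lam), and takes a
    weighted gradient step (Danskin: the weights are held fixed)."""
    n, d = X.shape
    w = np.zeros(d)
    for k in range(iters):
        m = min(n, int(np.ceil(n0 * growth ** k)))  # growing support size
        idx = rng.choice(n, size=m, replace=False)
        Xb, yb = X[idx], y[idx]
        losses = per_sample_losses(w, Xb, yb)
        z = losses - losses.max()          # shift for numerical stability
        p = np.exp(z / lam)
        p /= p.sum()                       # inner-max weights on the batch
        r = Xb @ w - yb
        grad = Xb.T @ (p * r)              # gradient of sum_i p_i * loss_i
        w -= lr * grad
    return w

w_hat = dro_sgd(X, y)
```

The geometric schedule `n0 * growth**k` is only one choice of growth rule; the paper's theory concerns how fast such a schedule should grow to balance stochastic error against per-iteration cost.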

Citations (10)
