We study constrained nested stochastic optimization problems in which the objective function is a composition of two smooth functions whose exact values and derivatives are not available. We propose a single-timescale stochastic approximation algorithm, which we call the nested averaged stochastic approximation (NASA), to find an approximate stationary point of the problem. The algorithm has two auxiliary averaged sequences (filters) which estimate the gradient of the composite objective function and the inner function value. By using a special Lyapunov function, we show that NASA achieves a sample complexity of $\mathcal{O}(1/\varepsilon^2)$ for finding an $\varepsilon$-approximate stationary point, thus outperforming all extant methods for nested stochastic approximation. Our method and its analysis are the same for both unconstrained and constrained problems, with no need for batch samples in constrained nonconvex stochastic optimization. We also present a simplified parameter-free variant of the NASA method for solving constrained single-level stochastic optimization problems, and we prove the same complexity result for both unconstrained and constrained problems.
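The structure described above (one decision sequence plus two averaged filters, all driven by a single step size) can be illustrated with a minimal sketch. This is not the paper's exact algorithm: the update rules, the constants `tau`, `a`, `b`, `beta`, the function names, and the toy problem (a noisy inner map `g(x) = x - c`, an outer function `f(u) = 0.5*||u||^2`, and a box constraint) are all assumptions chosen for illustration.

```python
import numpy as np


def nasa_sketch(grad_f, g_sample, jac_sample, project, x0,
                n_iters=2000, tau=0.05, a=1.0, b=1.0, beta=1.0, seed=0):
    """Illustrative single-timescale scheme with two averaged filters.

    z tracks the gradient of the composition f(g(x)); u tracks g(x).
    All parameter names and default values are hypothetical.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    u = g_sample(x, rng)                      # filter for the inner function value
    z = jac_sample(x, rng).T @ grad_f(u)      # filter for the composite gradient
    for _ in range(n_iters):
        y = project(x - z / beta)             # projected step along the filtered gradient
        x = x + tau * (y - x)                 # averaging keeps x feasible for tau in (0, 1]
        J = jac_sample(x, rng)                # noisy Jacobian of g at the new point
        z = (1 - a * tau) * z + a * tau * (J.T @ grad_f(u))
        u = (1 - b * tau) * u + b * tau * g_sample(x, rng)
    return x, z, u


# Toy instance (assumed): minimize 0.5*||x - c||^2 over the box [-1, 1]^3.
c = np.array([2.0, 0.5, -3.0])


def objective(x):
    return 0.5 * float(np.sum((x - c) ** 2))


x_final, z_final, u_final = nasa_sketch(
    grad_f=lambda u: u,                                           # gradient of 0.5*||u||^2
    g_sample=lambda x, rng: x - c + 0.1 * rng.standard_normal(3),  # noisy g(x)
    jac_sample=lambda x, rng: np.eye(3) + 0.1 * rng.standard_normal((3, 3)),
    project=lambda v: np.clip(v, -1.0, 1.0),                       # projection onto the box
    x0=np.zeros(3),
)
```

In this sketch each iteration draws a single noisy sample of the inner function and of its Jacobian, with no mini-batches, and the iterate stays inside the box because it is a convex combination of feasible points.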
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
Keywords
- Compositional optimization
- Machine learning
- Stochastic approximation
- Stochastic gradient
- Stochastic variational inequality