LIPIcs.FSTTCS.2017.2.pdf
- Filesize: 383 kB
- 10 pages
This work provides a simplified proof of the statistical minimax optimality of (iterate-averaged) stochastic gradient descent (SGD) for the special case of least squares. The result is obtained by analyzing SGD as a stochastic process and sharply characterizing the stationary covariance matrix of this process. The finite-rate optimality characterization captures the constant factors and addresses model misspecification.
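The object of study, iterate-averaged SGD for least squares, can be sketched numerically. The following is a minimal illustration, not the paper's analysis: it runs constant-step-size SGD on a synthetic least-squares problem (all problem sizes, the step size, and the noise level are assumptions chosen for the demo) and maintains a running average of the iterates, which is the estimator whose covariance the paper characterizes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic least-squares problem: y = x^T w_star + noise.
d, n = 5, 20000
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_star + 0.1 * rng.normal(size=n)

# Constant-step-size SGD with iterate averaging.
step = 0.01
w = np.zeros(d)        # current iterate
w_avg = np.zeros(d)    # running average of all iterates so far
for t in range(n):
    x_t, y_t = X[t], y[t]
    grad = (x_t @ w - y_t) * x_t        # stochastic gradient of 0.5*(x^T w - y)^2
    w -= step * grad
    w_avg += (w - w_avg) / (t + 1)      # incremental update of the average

# The averaged iterate is typically much closer to w_star than the last iterate,
# since averaging smooths out the stationary fluctuations of the SGD process.
err_last = np.linalg.norm(w - w_star)
err_avg = np.linalg.norm(w_avg - w_star)
print(err_avg, err_last)
```

The last iterate bounces around its stationary distribution (whose covariance scales with the step size), while the average concentrates; this gap is exactly what the stationary-covariance analysis in the paper quantifies.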