Events
DMS Applied and Computational Mathematics Seminar
Time: Mar 06, 2026 (02:00 PM)
Location: 328 Parker Hall
Speaker: David Shirokoff (New Jersey Institute of Technology)
Title: Convergence of Markov Chains for Stochastic Gradient Descent and the Failure of the Diffusion Approximation over Long Times
Abstract: Stochastic gradient descent (SGD) is a popular algorithm for minimizing objective functions that arise in machine learning. While convergence theory is well established for SGD with vanishing step sizes, less is known about the constant step-size setting. A common approach is to approximate the SGD iterates by a diffusion approximation, i.e., a stochastic differential equation. However, this approximation is, in general, only valid for finite times. Focusing on a class of nonconvex objective functions, we establish a "Doeblin-type decomposition" for the SGD Markov chain: the state space decomposes into a uniformly transient set and a disjoint union of absorbing sets. Each absorbing set contains a unique (ergodic) invariant measure, and the set of invariant measures is a global attractor of the Markov chain. The theory is highlighted with examples that show (1) the failure of the diffusion approximation to characterize the long-time dynamics of SGD; (2) the global minimum of an objective function may lie outside the support of the invariant measures (i.e., even if initialized at the global minimum, the SGD iterates will leave); and (3) bifurcations may enable the SGD iterates to transition between local minima. This is joint work with Philip Zaleski.
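As a minimal sketch of the setting the abstract describes (not taken from the talk), constant step-size SGD on a simple nonconvex double-well objective forms a time-homogeneous Markov chain in the iterate; the objective, step size, and noise model below are illustrative assumptions only.

```python
import random

def grad(x):
    # Gradient of the double-well objective f(x) = (x**2 - 1)**2,
    # a simple nonconvex function with local minima at x = -1 and x = +1.
    return 4.0 * x * (x**2 - 1.0)

def sgd(x0, step=0.05, noise=0.5, iters=10_000, seed=0):
    # Constant step-size SGD: each update uses a noisy gradient sample.
    # Because step and noise are fixed, the iterates (x_k) form a
    # time-homogeneous Markov chain that fluctuates near a local minimum
    # rather than converging to a point.
    rng = random.Random(seed)
    x = x0
    for _ in range(iters):
        g = grad(x) + noise * rng.gauss(0.0, 1.0)
        x = x - step * g
    return x

# Starting near the right well, the chain stays in a neighborhood of x = 1.
print(sgd(x0=0.9))
```

With a vanishing step size the iterates would converge; with a constant step size they only reach a stationary distribution concentrated near a minimum, which is the regime the talk's Markov-chain analysis addresses.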
Host: Thi-Thao-Phuong Hoang
For more, see the Applied and Computational Math Seminar page.
