Events

DMS Applied and Computational Mathematics Seminar

Time: Mar 06, 2026 (02:00 PM)
Location: 328 Parker Hall

Details:
Speaker: David Shirokoff (New Jersey Institute of Technology)
 
Title: Convergence of Markov Chains for Stochastic Gradient Descent and the Failure of the Diffusion Approximation over Long Times

Abstract: Stochastic gradient descent (SGD) is a popular algorithm for minimizing objective functions that arise in machine learning. While convergence theory is well established for SGD with vanishing step sizes, less is known about the constant step-size setting. A common approach is to approximate the SGD iterates by the solution of a stochastic differential equation (the diffusion approximation). However, this approximation is, in general, only valid for finite times. Focusing on a class of nonconvex objective functions, we establish a "Doeblin-type decomposition" for the SGD Markov chain: the state space decomposes into a uniformly transient set and a disjoint union of absorbing sets. Each absorbing set contains a unique (ergodic) invariant measure, and the set of invariant measures is a global attractor of the Markov chain. The theory is highlighted with examples showing that (1) the diffusion approximation fails to characterize the long-time dynamics of SGD; (2) the global minimum of an objective function may lie outside the support of the invariant measures (i.e., even if initialized at the global minimum, the SGD iterates will leave it); and (3) bifurcations may enable the SGD iterates to transition between local minima.
 
This is joint work with Philip Zaleski.
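To make the constant step-size setting concrete, the following is a minimal sketch (not from the talk) of SGD on a simple nonconvex objective, f(x) = (x^2 - 1)^2, with an assumed additive Gaussian noise model standing in for minibatch gradient noise. With a fixed step size, the iterates do not converge to a point; they fluctuate near a local minimum, consistent with convergence to an invariant measure. The function names and parameter values are illustrative assumptions.

```python
import random

# Nonconvex objective f(x) = (x^2 - 1)^2 with minima at x = +1 and x = -1.
# The "stochastic" gradient adds zero-mean Gaussian noise to the exact
# gradient 4x(x^2 - 1); this noise model is an illustrative assumption.
def noisy_grad(x, rng, sigma=0.5):
    return 4.0 * x * (x * x - 1.0) + sigma * rng.gauss(0.0, 1.0)

def sgd(x0, step=0.05, n_iters=20000, seed=0):
    """Run constant step-size SGD and return the final iterate."""
    rng = random.Random(seed)
    x = x0
    for _ in range(n_iters):
        x = x - step * noisy_grad(x, rng)
    return x

# Started near x = 1, the chain settles into persistent fluctuations
# around that local minimum rather than converging to it exactly.
final = sgd(x0=0.9)
```

Running many independent chains and histogramming the final iterates would approximate the invariant measure near each minimum, which is the object the talk's Markov-chain analysis characterizes.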
 
 
Host: Thi-Thao-Phuong Hoang