Stochastic Approximation algorithms are used to approximate solutions to fixed-point equations that involve expectations of functions with respect to possibly unknown distributions (a minimal sketch of the basic recursion appears after the outline below). The most famous examples today are TD- and Q-learning algorithms. This three-hour tutorial lecture series, courtesy of the Simons Institute for the Theory of Computing at UC Berkeley, will consist of two parts:
- The basics: an overview of stochastic approximation, with a focus on optimizing the rate of convergence. An algorithm that achieves the best rate is analogous to the Newton-Raphson algorithm.
- Leftovers and review from Part 1, followed by applications to reinforcement learning for discounted-cost optimal control and optimal stopping problems. Theory from Part 1 leads to the new Zap Q-learning algorithm. Analysis suggests that its transient behavior is a close match to a deterministic Newton-Raphson implementation (see the matrix-gain sketch below), and numerical experiments confirm super-fast convergence.
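To make the fixed-point formulation concrete, here is a minimal stochastic approximation sketch in Python. Everything in it is an illustrative assumption rather than material from the lectures: the toy function f(theta, x) = x - theta, the Gaussian sample stream, and the 1/n step size are chosen only to show the basic recursion, which drives theta toward the root of E[f(theta, X)] = 0 (here, the mean of X).

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed-point problem: find theta with E[f(theta, X)] = 0.
# Toy choice for illustration: f(theta, x) = x - theta, so theta* = E[X].
def f(theta, x):
    return x - theta

theta = 0.0
for n in range(1, 10_001):
    x = rng.normal(loc=2.5, scale=1.0)   # one new noisy sample X_n
    a_n = 1.0 / n                         # diminishing step size
    theta += a_n * f(theta, x)            # stochastic approximation update

print(f"estimate: {theta:.3f}  (target E[X] = 2.5)")
```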
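The Newton-Raphson analogy above is built around a matrix-gain recursion: a second, faster running estimate of the mean Jacobian is maintained and inverted in the parameter update. The sketch below shows that general matrix-gain idea on a toy linear problem; the matrices A_bar and b_bar, the noise model, and the step-size exponents are all illustrative assumptions, and this is not the lecturer's Zap Q-learning algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy linear root-finding problem (illustrative assumption): solve
# E[f(theta, X)] = 0 with f(theta, X) = b(X) - A(X) @ theta,
# so the fixed point is theta* = A_bar^{-1} b_bar, A_bar = E[A(X)], b_bar = E[b(X)].
d = 3
A_bar = np.array([[3.0, 0.5, 0.0],
                  [0.5, 2.0, 0.3],
                  [0.0, 0.3, 1.5]])
b_bar = np.array([1.0, -2.0, 0.5])

theta = np.zeros(d)
A_hat = -np.eye(d)   # running estimate of the mean Jacobian, which is -A_bar here

for n in range(1, 20_001):
    A_n = A_bar + 0.1 * rng.standard_normal((d, d))   # noisy observation of A(X_n)
    b_n = b_bar + 0.1 * rng.standard_normal(d)        # noisy observation of b(X_n)
    f_n = b_n - A_n @ theta                           # one noisy evaluation of f

    # Faster timescale: track the mean Jacobian estimate toward -A_n.
    gamma_n = 1.0 / n ** 0.85
    A_hat += gamma_n * (-A_n - A_hat)

    # Slower timescale: matrix-gain (Newton-Raphson-style) parameter update.
    a_n = 1.0 / n
    theta -= a_n * np.linalg.solve(A_hat, f_n)

theta_star = np.linalg.solve(A_bar, b_bar)
print("matrix-gain estimate:", np.round(theta, 3))
print("true fixed point    :", np.round(theta_star, 3))
```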
Slides for the lectures can be downloaded HERE.