Proximal Algorithms and Temporal Differences for Large Linear Systems: Extrapolation, Approximation, and Simulation (1610.05427v4)

Published 18 Oct 2016 in cs.NA

Abstract: We consider large linear and nonlinear fixed point problems, and solution with proximal algorithms. We show that there is a close connection between two seemingly different types of methods from distinct fields: 1) Proximal iterations for linear systems of equations, which are prominent in numerical analysis and convex optimization, and 2) Temporal difference (TD) type methods, such as TD(lambda), LSTD(lambda), and LSPE(lambda), which are central in simulation-based approximate dynamic programming/reinforcement learning (DP/RL), and its recent prominent successes in large-scale game contexts, among others. One benefit of this connection is a new and simple way to accelerate the standard proximal algorithm by extrapolation towards the TD iteration, which generically has a faster convergence rate. Another benefit is the potential integration into the proximal algorithmic context of several new ideas that have emerged in the DP/RL context. We discuss some of the possibilities, and in particular, algorithms that project each proximal iterate onto the subspace spanned by a small number of basis functions, using low-dimensional calculations and simulation. A third benefit is that insights and analysis from proximal algorithms can be brought to bear on the enhancement of TD methods. The linear fixed point methodology can be extended to nonlinear fixed point problems involving a contraction, thus providing guaranteed and potentially substantial acceleration of the proximal and forward backward splitting algorithms at no extra cost. Moreover, the connection of proximal and TD methods can be extended to nonlinear (nondifferentiable) fixed point problems through new proximal-like algorithms that involve successive linearization, similar to policy iteration in DP.

Citations (5)

View on Semantic Scholar

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Proximal Algorithms and Temporal Differences for Large Linear Systems: Extrapolation, Approximation, and Simulation (1610.05427v4)

Summary

Related Papers