Tensor and Matrix Low-Rank Value-Function Approximation in Reinforcement Learning (2201.09736v3)

Published 21 Jan 2022 in cs.LG and cs.AI

Abstract: Value-function (VF) approximation is a central problem in Reinforcement Learning (RL). Classical non-parametric VF estimation suffers from the curse of dimensionality. As a result, parsimonious parametric models have been adopted to approximate VFs in high-dimensional spaces, with most efforts being focused on linear and neural-network-based approaches. In contrast, this paper puts forth a parsimonious non-parametric approach, where we use stochastic low-rank algorithms to estimate the VF matrix in an online and model-free fashion. Furthermore, as VFs tend to be multi-dimensional, we propose replacing the classical VF matrix representation with a tensor (multi-way array) representation and then using the PARAFAC decomposition to design an online model-free tensor low-rank algorithm. Different versions of the algorithms are proposed, their complexity is analyzed, and their performance is assessed numerically using standardized RL environments.
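As a rough illustration of the matrix variant described in the abstract, the sketch below stores the state-action value matrix as a product of two thin factors and updates only the rows touched by each sampled transition. It is not the authors' algorithm: the function name `low_rank_q_learning`, the tabular inputs `P` and `R`, the epsilon-greedy exploration, and all hyperparameter values are assumptions chosen for the example. The update is a plain stochastic semi-gradient step on the temporal-difference error.

```python
import numpy as np

def low_rank_q_learning(P, R, rank=2, gamma=0.9, alpha=0.05,
                        eps=0.2, steps=20000, seed=0):
    """Hypothetical sketch: Q-learning with Q approximated as L @ Rf.T.

    P: transition probabilities, shape (S, A, S); used only to sample
       next states, so the learner itself stays model-free.
    R: rewards, shape (S, A).
    """
    rng = np.random.default_rng(seed)
    S, A = R.shape
    L = 0.1 * rng.standard_normal((S, rank))   # state factors
    Rf = 0.1 * rng.standard_normal((A, rank))  # action factors
    s = int(rng.integers(S))
    for _ in range(steps):
        q_s = L[s] @ Rf.T                      # Q(s, .) from the factors
        # Epsilon-greedy action selection (an assumption of this sketch).
        a = int(rng.integers(A)) if rng.random() < eps else int(np.argmax(q_s))
        s_next = int(rng.choice(S, p=P[s, a]))
        target = R[s, a] + gamma * np.max(L[s_next] @ Rf.T)
        td = target - q_s[a]                   # temporal-difference error
        l_old = L[s].copy()                    # keep the pre-update row
        L[s] += alpha * td * Rf[a]             # semi-gradient step, state row
        Rf[a] += alpha * td * l_old            # semi-gradient step, action row
        s = s_next
    return L, Rf
```

The factors hold `(S + A) * rank` parameters instead of the `S * A` entries of the full matrix. The tensor variant mentioned in the abstract would, under the same assumptions, keep one factor matrix per state and action dimension and combine them through a PARAFAC model rather than a two-factor product.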

Authors (3)

Sergio Rozada (12 papers)
Santiago Paternain (50 papers)
Antonio G. Marques (78 papers)

Citations (10)
