Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error (2201.12417v2)

Published 28 Jan 2022 in cs.LG, cs.AI, and stat.ML

Abstract: In this work, we study the use of the Bellman equation as a surrogate objective for value prediction accuracy. While the Bellman equation is uniquely solved by the true value function over all state-action pairs, we find that the Bellman error (the difference between both sides of the equation) is a poor proxy for the accuracy of the value function. In particular, we show that (1) due to cancellations from both sides of the Bellman equation, the magnitude of the Bellman error is only weakly related to the distance to the true value function, even when considering all state-action pairs, and (2) in the finite data regime, the Bellman equation can be satisfied exactly by infinitely many suboptimal solutions. This means that the Bellman error can be minimized without improving the accuracy of the value function. We demonstrate these phenomena through a series of propositions, illustrative toy examples, and empirical analysis in standard benchmark domains.
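
Point (2) of the abstract can be illustrated with a minimal sketch. The two-state chain, the dataset, and every name below (`TRUE_V`, `DATASET`, `candidate`, and so on) are assumptions made for illustration and are not taken from the paper; the sketch also uses state values rather than state-action values to stay short. The dataset is missing one transition, so a whole family of value functions has zero Bellman error on the data while being arbitrarily far from the true values.

```python
GAMMA = 0.9

# Hypothetical two-state chain (not from the paper): s0 -> s1 -> terminal.
# The reward is 0 on the first step and 1 on the second.
TRUE_V = {"s0": GAMMA * 1.0, "s1": 1.0}

# Finite dataset that happens to miss the s1 -> terminal transition.
# Each entry is (state, reward, next_state).
DATASET = [("s0", 0.0, "s1")]


def bellman_error(v, dataset, gamma=GAMMA):
    """Mean absolute Bellman error of v over the transitions in dataset."""
    total = 0.0
    for s, r, s_next in dataset:
        target = r + gamma * v.get(s_next, 0.0)
        total += abs(v[s] - target)
    return total / len(dataset)


def value_error(v, true_v=TRUE_V):
    """Mean absolute distance between v and the true value function."""
    return sum(abs(v[s] - true_v[s]) for s in true_v) / len(true_v)


def candidate(c, gamma=GAMMA):
    """Value function with v(s1) = c and v(s0) = gamma * c.

    It satisfies the Bellman equation on DATASET for every choice of c.
    """
    return {"s0": gamma * c, "s1": c}


if __name__ == "__main__":
    for c in (1.0, 0.0, -5.0, 100.0):
        v = candidate(c)
        print(f"c={c:7.1f}  bellman_error={bellman_error(v, DATASET):.3f}  "
              f"value_error={value_error(v):.3f}")
```

Only `c = 1.0` recovers the true value function, yet every choice of `c` drives the Bellman error on the dataset to zero, so minimizing that error alone gives no signal about which candidate is accurate.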

Citations (25)
