
Revisiting Bellman Errors for Offline Model Selection (2302.00141v2)

Published 31 Jan 2023 in cs.LG, cs.AI, and stat.ML

Abstract: Offline model selection (OMS), that is, choosing the best policy from a set of many policies given only logged data, is crucial for applying offline RL in real-world settings. One idea that has been extensively explored is to select policies based on the mean squared Bellman error (MSBE) of the associated Q-functions. However, previous work has struggled to obtain adequate OMS performance with Bellman errors, leading many researchers to abandon the idea. To this end, we elucidate why previous work has seen pessimistic results with Bellman errors and identify conditions under which OMS algorithms based on Bellman errors will perform well. Moreover, we develop a new estimator of the MSBE that is more accurate than prior methods. Our estimator obtains impressive OMS performance on diverse discrete control tasks, including Atari games.
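
To make the basic idea concrete, here is a minimal sketch of ranking candidate Q-functions by an empirical Bellman error on logged transitions. It is not the estimator proposed in the paper: it uses the naive mean squared one-sample temporal-difference error, which in stochastic environments is known to overestimate the MSBE by a variance term. The tabular Q representation, the function names, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# One logged transition: (state, action, reward, next_state, done).
Transition = Tuple[int, int, float, int, bool]
# Hypothetical tabular Q-function: q[state][action] -> value.
QTable = Sequence[Sequence[float]]


def empirical_squared_td_error(
    q: QTable, data: List[Transition], gamma: float = 0.99
) -> float:
    """Mean squared one-sample Bellman (TD) error of q on logged data."""
    total = 0.0
    for s, a, r, s_next, done in data:
        # Bellman optimality backup built from a single sampled next state.
        bootstrap = 0.0 if done else gamma * max(q[s_next])
        total += (q[s][a] - (r + bootstrap)) ** 2
    return total / len(data)


def select_by_bellman_error(
    candidates: Dict[str, QTable], data: List[Transition], gamma: float = 0.99
) -> str:
    """Return the name of the candidate with the smallest empirical error."""
    scores = {
        name: empirical_squared_td_error(q, data, gamma)
        for name, q in candidates.items()
    }
    return min(scores, key=scores.get)


# Toy logged data: from state 0, action 0 yields reward 1 and action 1
# yields reward 0; both transitions end the episode in state 1.
data: List[Transition] = [(0, 0, 1.0, 1, True), (0, 1, 0.0, 1, True)]
q_good: QTable = [[1.0, 0.0], [0.0, 0.0]]  # matches the observed rewards
q_bad: QTable = [[0.5, 0.5], [0.0, 0.0]]   # off by 0.5 on each transition

best = select_by_bellman_error({"good": q_good, "bad": q_bad}, data)
```

On this deterministic toy data the one-sample error is exact, so `q_good` scores 0 and `q_bad` scores 0.25, and `best` is `"good"`. The policy that would be deployed is the greedy policy of the selected Q-function.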

