Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash 92 tok/s
Gemini 2.5 Pro 53 tok/s Pro
GPT-5 Medium 36 tok/s
GPT-5 High 36 tok/s Pro
GPT-4o 113 tok/s
GPT OSS 120B 472 tok/s Pro
Kimi K2 214 tok/s Pro
2000 character limit reached

Post Reinforcement Learning Inference (2302.08854v4)

Published 17 Feb 2023 in stat.ML, cs.LG, and econ.EM

Abstract: We consider estimation and inference using data collected from reinforcement learning algorithms. These algorithms, characterized by their adaptive experimentation, interact with individual units over multiple stages, dynamically adjusting their strategies based on previous interactions. Our goal is to evaluate a counterfactual policy post-data collection and estimate structural parameters, like dynamic treatment effects, which can be used for credit assignment and determining the effect of earlier actions on final outcomes. Such parameters of interest can be framed as solutions to moment equations, but not minimizers of a population loss function, leading to Z-estimation approaches for static data. However, in the adaptive data collection environment of reinforcement learning, where algorithms deploy nonstationary behavior policies, standard estimators do not achieve asymptotic normality due to the fluctuating variance. We propose a weighted Z-estimation approach with carefully designed adaptive weights to stabilize the time-varying estimation variance. We identify proper weighting schemes to restore the consistency and asymptotic normality of the weighted Z-estimators for target parameters, which allows for hypothesis testing and constructing uniform confidence regions. Primary applications include dynamic treatment effect estimation and dynamic off-policy evaluation.

Citations (2)
List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube