Investigating the dual-V formulation of ReCOIL
Investigate the dual-V formulation of ReCOIL for off-policy imitation learning, characterizing its theoretical properties and its practical utility.
References
We also present the dual-V form for ReCOIL in Appendix~\ref{ap:closer} but defer its investigation for future work.
— Dual RL: Unification and New Methods for Reinforcement and Imitation Learning
(arXiv:2302.08560, Sikchi et al., 2023), Section 4 (ReCOIL: Imitation Learning from Arbitrary Experience)