Complete treatment of bounded ES for time-varying-goal manipulation
Develop a complete theoretical treatment of bounded extremum seeking for the time-varying-goal versions of the pushing and pick-and-place tasks used in the ES-DRL controller, explicitly accounting for the additional terms induced by the goal’s rate of change after the RL-to-ES handoff.
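As a rough illustration of where such goal-rate terms can come from, the sketch below uses the standard bounded ES form from the extremum seeking literature with a quadratic tracking cost. This is an assumption made only for illustration: the ES-DRL controller's actual update law, cost, and handoff condition are not reproduced in this entry. The symbols x, g(t), C, k, \alpha, \omega_i, e and v are all introduced here, and the averaging step presumes the goal moves slowly relative to the dither frequencies.

```latex
% Illustrative sketch only: assumed bounded ES law with a quadratic cost.
% x is the controlled state, g(t) the time-varying goal.
\begin{align}
  \dot{x}_i &= \sqrt{\alpha\,\omega_i}\,\cos\!\bigl(\omega_i t + k\,C(x,t)\bigr),
  \qquad C(x,t) = \lVert x - g(t) \rVert^2 , \\
% Averaged dynamics for large, distinct dither frequencies \omega_i:
  \dot{\bar{x}} &= -\frac{k\alpha}{2}\,\nabla_x C(\bar{x},t)
                 = -k\alpha\,\bigl(\bar{x} - g(t)\bigr), \\
% Tracking error e = \bar{x} - g(t): the goal velocity enters as a forcing term.
  \dot{e} &= -k\alpha\,e - \dot{g}(t), \\
% If the goal speed is bounded, \lVert \dot{g}(t) \rVert \le v for all t:
  \limsup_{t\to\infty}\,\lVert e(t) \rVert &\le \frac{v}{k\alpha}.
\end{align}
```

Under these assumptions, a static goal (\dot{g} = 0) gives exponential convergence of the averaged error, while a moving goal leaves a residual error that scales with the goal speed and shrinks as the product k\alpha grows. A complete treatment would additionally have to relate the averaged system to the true trajectories and to the task-specific costs for pushing and pick-and-place.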
References
A similar argument applies to the time-varying-goal pushing and pick-and-place tasks, where additional terms arise due to the rate of change of the goal. A complete treatment of that case is beyond the scope of this paper and is left for future work.
— Deep Reinforcement Learning for Robotic Manipulation under Distribution Shift with Bounded Extremum Seeking
(2604.01142 - Saxena et al., 1 Apr 2026) in Section 4.3 (Supervisor), following Proposition (Sketch of Proof)