Optimal Weight Selection for Weighted Goal Loss in Gradient-Based Planning

Determine the optimal sequence of weights {w_i}_{i=2}^{H+1} in the Weighted Goal Loss objective used for gradient-based planning with a latent world model f_θ and a fixed encoder Φ_μ. Specifically, select the weighting scheme for L_WGL, defined as the average over timesteps of w_i times the squared L2 distance between the predicted latent state at timestep i and the goal latent state, so that it most effectively guides the optimization of action sequences over a planning horizon H.
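
Written out in the notation above (ẑ_i and z_g are our transcription of the predicted and goal latents; the paper's exact symbols may differ), the objective is

L_WGL = (1/H) Σ_{i=2}^{H+1} w_i ||ẑ_i − z_g||_2^2,

where each ẑ_i is obtained by unrolling f_θ from the encoded initial observation under the candidate action sequence, and z_g is the goal observation encoded by Φ_μ.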

Background

The paper introduces a Weighted Goal Loss (WGL) objective to improve gradient-based planning (GBP) by encouraging intermediate predicted latent states to approach the goal latent state, rather than focusing solely on the terminal state. The authors found that different tasks benefited from different temporal weighting strategies: later-state upweighting for navigation tasks such as PointMaze and Wall, and earlier-state upweighting for the manipulation task PushT.
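
For concreteness, the two strategies can be expressed as monotone weight schedules plugged into the WGL term of a gradient-based planner. The sketch below is illustrative only: the exponential schedules, the `world_model(z, a) -> z_next` interface, and all hyperparameters are our assumptions, not the paper's implementation.

```python
import torch

def wgl_weights(H, scheme="later", gamma=2.0):
    """Monotone weight schedules for w_2..w_{H+1} (assumed exponential form).

    "later"   upweights states near the end of the horizon (PointMaze, Wall);
    "earlier" upweights states early in the horizon (PushT).
    """
    idx = torch.arange(H, dtype=torch.float32)  # positions 0..H-1 for timesteps 2..H+1
    w = gamma ** idx if scheme == "later" else gamma ** (H - 1 - idx)
    return w / w.mean()  # normalize so L_WGL stays an average over timesteps

def weighted_goal_loss(z_pred, z_goal, w):
    """L_WGL: weighted mean of squared L2 distances to the goal latent.

    z_pred: (H, D) predicted latents for timesteps 2..H+1; z_goal: (D,).
    """
    d2 = ((z_pred - z_goal) ** 2).sum(dim=-1)  # squared L2 distance per timestep
    return (w * d2).mean()

def plan_gbp(world_model, z0, z_goal, H, action_dim, steps=100, lr=0.1, scheme="later"):
    """Gradient-based planning: optimize an action sequence against L_WGL.

    `world_model(z, a) -> z_next` is a frozen, differentiable latent dynamics
    model (assumed API); z0 is the encoded initial observation.
    """
    actions = torch.zeros(H, action_dim, requires_grad=True)
    opt = torch.optim.Adam([actions], lr=lr)
    w = wgl_weights(H, scheme)
    for _ in range(steps):
        z, preds = z0, []
        for a in actions:            # unroll the latent dynamics over the horizon
            z = world_model(z, a)
            preds.append(z)
        loss = weighted_goal_loss(torch.stack(preds), z_goal, w)
        opt.zero_grad()
        loss.backward()              # gradients flow through the unrolled model
        opt.step()
    return actions.detach()
```

Under this parameterization, scheme="later" approaches terminal-state-only planning as gamma grows, while scheme="earlier" pulls the predicted trajectory toward the goal from the first steps, mirroring the navigation-versus-manipulation split reported above.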

Despite these empirical observations, the authors explicitly state that determining the optimal sequence of weights remains unresolved and is left for future work. This open problem seeks a principled and general method to select or learn the weighting sequence that best stabilizes and accelerates GBP across planning horizons and task domains.

References

"We leave the optimal selection of this sequence of weights as future work."

— Closing the Train-Test Gap in World Models for Gradient-Based Planning (arXiv:2512.09929, Parthasarathy et al., 10 Dec 2025), Appendix, Section "Additional Experiment Results", Subsection "Weighted Goal Loss".