Systematic study of forgetting metrics in continual learning

Develop a systematic framework for evaluating forgetting in continual robot policy learning by formalizing, analyzing, and validating forgetting metrics beyond standard Negative Backward Transfer (NBT), including normalized variants that correct for differences in initial task success rates on benchmarks such as LIBERO-10.

Background

The paper shows that standard Negative Backward Transfer (NBT) can penalize higher-performing policies more heavily: a drop from a higher initial success rate contributes a larger absolute value to NBT, so even when two policies forget a task completely, the one that started stronger accrues the larger penalty. To mitigate this, the authors introduce a normalized NBT that scales each task's forgetting by its initial success rate, and present empirical comparisons on LIBERO-10.

Despite proposing a normalization, the authors explicitly note that a more comprehensive and principled assessment of forgetting metrics is needed to fairly compare pretrained Vision-Language-Action models and non-pretrained baselines across buffer sizes and task diversity.
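To make the distortion concrete, here is a minimal sketch of the two metrics. The exact formulas are assumptions for illustration: standard NBT is taken as the mean drop from each task's success rate measured right after training on it to its rate at the end of the task sequence, and the normalized variant divides each drop by the initial rate. The paper's definitions may differ in detail.

```python
# Hypothetical sketch -- NBT formulas here are illustrative assumptions,
# not the paper's exact definitions.

def nbt(initial, final):
    """Mean absolute forgetting: average over tasks of (initial_i - final_i)."""
    return sum(a - b for a, b in zip(initial, final)) / len(initial)

def normalized_nbt(initial, final, eps=1e-8):
    """Mean forgetting scaled by each task's initial success rate."""
    return sum((a - b) / (a + eps) for a, b in zip(initial, final)) / len(initial)

# Two policies that both forget everything. Standard NBT penalizes the
# stronger policy (0.9 initial success) more than the weaker one (0.4),
# while the normalized variant rates their forgetting as equal.
strong = ([0.9, 0.9], [0.0, 0.0])
weak = ([0.4, 0.4], [0.0, 0.0])

print(nbt(*strong), nbt(*weak))                        # 0.9 vs. 0.4
print(normalized_nbt(*strong), normalized_nbt(*weak))  # both ~1.0
```

Under these assumed definitions, the normalized metric answers "what fraction of initially acquired competence was lost?", which is the fairness property the authors seek when comparing pretrained and non-pretrained policies.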

References

"We leave a more systematic study of forgetting metrics to future work."

Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning (2603.03818 - Liu et al., 4 Mar 2026), Appendix A.2 (LIBERO-10 Continual Learning Results)