Hyperparameters in Continual Learning: A Reality Check (2403.09066v4)

Published 14 Mar 2024 in cs.LG and cs.CV

Abstract: Continual learning (CL) aims to train a model on a sequence of tasks (i.e., a CL scenario) while balancing the trade-off between plasticity (learning new tasks) and stability (retaining prior knowledge). The commonly adopted conventional evaluation protocol for CL algorithms selects the best hyperparameters (e.g., learning rate, mini-batch size, regularization strength) within a given scenario and then evaluates the algorithms using these hyperparameters in the same scenario. However, this protocol has significant shortcomings: it overestimates the CL capacity of algorithms and relies on unrealistic hyperparameter tuning that is not feasible for real-world applications. From the fundamental principles of evaluation in machine learning, we argue that the evaluation of CL algorithms should focus on assessing the generalizability of their CL capacity to unseen scenarios. Based on this, we propose the Generalizable Two-phase Evaluation Protocol (GTEP), consisting of a hyperparameter tuning phase and an evaluation phase. Both phases share the same scenario configuration (e.g., number of tasks) but are generated from different datasets. Hyperparameters of CL algorithms are tuned in the first phase and applied in the second phase to evaluate the algorithms. We apply this protocol to class-incremental learning, both with and without pretrained models. Across more than 8,000 experiments, our results show that most state-of-the-art algorithms fail to replicate their reported performance, highlighting that their CL capacity has been significantly overestimated under the conventional evaluation protocol. Our implementation can be found at https://github.com/csm9493/GTEP.
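The following is a minimal sketch of the two-phase idea described in the abstract, not the authors' implementation (see the linked repository for that). It assumes a simple grid search in the tuning phase; the `run_cl_scenario` helper, its signature, and the example hyperparameter values are hypothetical placeholders.

```python
from itertools import product

def run_cl_scenario(dataset, num_tasks, hparams):
    """Placeholder (hypothetical): train a CL algorithm on a class-incremental
    scenario built from `dataset` with `num_tasks` tasks, using `hparams`,
    and return a scalar score such as average incremental accuracy."""
    raise NotImplementedError

def gtep(tuning_dataset, eval_dataset, num_tasks, hparam_grid):
    # Phase 1: tune hyperparameters on a scenario generated from the tuning dataset.
    candidates = [dict(zip(hparam_grid, values))
                  for values in product(*hparam_grid.values())]
    best = max(candidates,
               key=lambda h: run_cl_scenario(tuning_dataset, num_tasks, h))
    # Phase 2: evaluate on an unseen scenario with the same configuration
    # (same number of tasks), reusing the Phase-1 hyperparameters unchanged.
    eval_score = run_cl_scenario(eval_dataset, num_tasks, best)
    return best, eval_score

# Illustrative grid (values are not from the paper):
grid = {"lr": [0.1, 0.01], "batch_size": [64, 128], "reg_strength": [1.0, 10.0]}
```

The point of the split is that the evaluation-phase score reflects how well the tuned hyperparameters generalize to a scenario the algorithm was never tuned on, rather than how well they fit the tuning scenario itself.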
