
Active Continual Learning: On Balancing Knowledge Retention and Learnability (2305.03923v2)

Published 6 May 2023 in cs.LG and cs.CL

Abstract: Acquiring new knowledge without forgetting what has been learned in a sequence of tasks is the central focus of continual learning (CL). While tasks arrive sequentially, their training data are often prepared and annotated independently, so CL in practice reduces to a sequence of incoming supervised learning tasks. This paper considers the under-explored problem of active continual learning (ACL) for a sequence of active learning (AL) tasks, where each incoming task includes a pool of unlabelled data and an annotation budget. We investigate the effectiveness and interplay of several AL and CL algorithms in the domain-, class- and task-incremental scenarios. Our experiments reveal the trade-off between two contrasting goals: not forgetting old knowledge (the aim of CL) and quickly learning new knowledge (the aim of AL). While conditioning the AL query strategy on the annotations collected for previous tasks improves task performance in the domain- and task-incremental settings, our proposed forgetting-learning profile suggests a gap in balancing the effects of AL and CL in the class-incremental scenario.
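To make the ACL setup concrete, below is a minimal sketch of the loop the abstract describes: tasks arrive one at a time, each with an unlabelled pool and an annotation budget; an AL strategy spends the budget, and a CL mechanism guards against forgetting. This is not the paper's implementation: the names (`entropy_query`, `train_acl`, `oracle`, `replay_size`) are illustrative assumptions, and plain uncertainty sampling plus a small replay buffer stand in for the various AL and CL algorithms the paper actually compares.

```python
# Hypothetical sketch of an active continual learning (ACL) loop.
# AL strategy: predictive-entropy uncertainty sampling.
# CL strategy: a bounded experience-replay buffer.
import random

import torch
import torch.nn.functional as F


def entropy_query(model, pool, budget):
    """Rank unlabelled examples by predictive entropy; return the top `budget` indices."""
    model.eval()
    scores = []
    with torch.no_grad():
        for i, x in enumerate(pool):
            probs = F.softmax(model(x.unsqueeze(0)), dim=-1).squeeze(0)
            entropy = -(probs * probs.log()).sum().item()
            scores.append((entropy, i))
    scores.sort(reverse=True)  # most uncertain first
    return [i for _, i in scores[:budget]]


def train_acl(model, tasks, budget, oracle, replay_size=200, epochs=1, lr=1e-3):
    """tasks: list of unlabelled pools (lists of tensors); oracle(x) -> int label."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    replay = []  # (x, y) pairs retained from earlier tasks
    for pool in tasks:
        # AL step: spend this task's annotation budget on uncertain points.
        picked = entropy_query(model, pool, budget)
        labelled = [(pool[i], oracle(pool[i])) for i in picked]
        # CL step: train on new labels mixed with replayed old labels.
        model.train()
        for _ in range(epochs):
            batch = labelled + random.sample(replay, min(len(replay), budget))
            for x, y in batch:
                loss = F.cross_entropy(model(x.unsqueeze(0)), torch.tensor([y]))
                opt.zero_grad()
                loss.backward()
                opt.step()
        replay.extend(labelled)
        replay = replay[-replay_size:]  # keep the memory bounded
    return model
```

The interesting design question the paper studies sits in `entropy_query`: whether the query strategy should be conditioned on annotations from previous tasks (as the shared `model` implicitly is here) or treat each task's pool independently.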

Authors (5)
  1. Thuy-Trang Vu (23 papers)
  2. Shahram Khadivi (29 papers)
  3. Mahsa Ghorbanali (2 papers)
  4. Dinh Phung (147 papers)
  5. Gholamreza Haffari (141 papers)
Citations (2)
