
Interactive Continual Learning: Fast and Slow Thinking (2403.02628v2)

Published 5 Mar 2024 in cs.CV and cs.LG

Abstract: Advanced life forms, sustained by the synergistic interaction of neural cognitive mechanisms, continually acquire and transfer knowledge throughout their lifespan. In contrast, contemporary machine learning paradigms exhibit limitations in emulating continual learning (CL). Nonetheless, the emergence of LLMs presents promising avenues for realizing CL through interaction with these models. Drawing on Complementary Learning System theory, this paper presents a novel Interactive Continual Learning (ICL) framework, enabled by collaborative interactions among models of various sizes. Specifically, we assign the ViT model as System1 and a multimodal LLM as System2. To enable the memory module to deduce tasks from class information and enhance Set2Set retrieval, we propose the Class-Knowledge-Task Multi-Head Attention (CKT-MHA). Additionally, to improve memory retrieval in System1 through enhanced geometric representation, we introduce the CL-vMF mechanism, based on the von Mises-Fisher (vMF) distribution. Meanwhile, we introduce the von Mises-Fisher Outlier Detection and Interaction (vMF-ODI) strategy to identify hard examples, thus enhancing collaboration between System1 and System2 on complex reasoning. Comprehensive evaluation of our proposed ICL demonstrates significant resistance to forgetting and superior performance relative to existing methods. Code is available at github.com/ICL.
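
The vMF-ODI idea described in the abstract — modeling each class's unit-normalized features with a von Mises-Fisher distribution and deferring low-likelihood examples to System2 — can be sketched roughly as below. This is a minimal illustration under assumed details, not the paper's implementation: the function names, the shared concentration `kappa`, and the threshold `tau` are placeholders chosen for exposition.

```python
import torch
import torch.nn.functional as F

def vmf_scores(features: torch.Tensor, prototypes: torch.Tensor,
               kappa: float = 16.0) -> torch.Tensor:
    """Unnormalized vMF log-likelihoods kappa * mu^T z of unit features z
    against per-class mean directions mu. The vMF normalizer depends only
    on kappa, so it cancels when comparing classes."""
    z = F.normalize(features, dim=-1)     # (B, D) embeddings on the unit sphere
    mu = F.normalize(prototypes, dim=-1)  # (C, D) one mean direction per class
    return kappa * z @ mu.t()             # (B, C) class scores

def route_hard_examples(features: torch.Tensor, prototypes: torch.Tensor,
                        kappa: float = 16.0, tau: float = 8.0):
    """Predict with System1 and flag samples whose best vMF score falls
    below tau as 'hard', to be deferred to System2."""
    scores = vmf_scores(features, prototypes, kappa)
    best, preds = scores.max(dim=-1)
    hard = best < tau  # low likelihood under every class => treat as outlier
    return preds, hard
```

In an ICL-style loop, `features` would come from the System1 ViT encoder, and samples with `hard == True` would be handed to the multimodal LLM (System2) for more deliberate reasoning.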

Authors (7)
  1. Biqing Qi (37 papers)
  2. Xingquan Chen (2 papers)
  3. Junqi Gao (17 papers)
  4. Jianxing Liu (12 papers)
  5. Ligang Wu (10 papers)
  6. Bowen Zhou (141 papers)
  7. Dong Li (429 papers)
Citations (12)
