IF2Net: Innately Forgetting-Free Networks for Continual Learning (2306.10480v1)

Published 18 Jun 2023 in cs.LG and cs.CV

Abstract: Continual learning incrementally absorbs new concepts without interfering with previously learned knowledge. Motivated by the fact that neural networks store information in the weights on their connections, we investigate how to design an Innately Forgetting-Free Network (IF2Net) for the continual learning setting. This study proposes a straightforward yet effective learning paradigm that keeps the weights associated with each seen task untouched before and after learning a new task. We first present representation-level learning on task sequences with random weights: representations that drift due to randomization are tweaked back to their separate task-optimal working states, while the involved weights remain frozen and are reused (in contrast to the usual layer-wise weight updates). Then, sequential decision-making without forgetting is achieved by projecting the output weight updates into a parsimonious orthogonal space, so that the adaptations do not disturb old knowledge while model plasticity is maintained. By integrating the respective strengths of randomization and orthogonalization, IF2Net allows a single network to inherently learn unlimited mapping rules without being told task identities at test time. We validate the effectiveness of our approach through extensive theoretical analysis and empirical study.
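The abstract combines two mechanisms: a representation layer whose random weights are drawn once, frozen, and reused across tasks, and output-weight updates that are projected into a subspace orthogonal to the representations of earlier tasks. The sketch below illustrates how these two pieces can fit together in the spirit of orthogonal-projection continual learning; the dimensions, the tanh nonlinearity, the recursive projector update, and the learning rate are illustrative assumptions, not the paper's exact IF2Net procedure.

```python
import numpy as np

# Minimal sketch (assumed setup, not the paper's exact algorithm):
# (1) frozen random-weight features reused for every task,
# (2) output-weight updates projected orthogonally to old-task representations.

rng = np.random.default_rng(0)
d_in, d_feat, n_classes = 32, 64, 10        # illustrative sizes

# Frozen random representation weights: drawn once, never trained.
W_rand = rng.standard_normal((d_in, d_feat)) / np.sqrt(d_in)

W_out = np.zeros((n_classes, d_feat))        # output weights shared across tasks
P = np.eye(d_feat)                           # projector onto space orthogonal to old inputs
alpha, lr = 1.0, 0.1                         # regularizer and step size (assumed)

def features(X):
    """Representation with frozen random weights (never updated)."""
    return np.tanh(X @ W_rand)

def train_on_task(X, Y, W_out, P):
    """X: (n, d_in) raw inputs; Y: (n, n_classes) one-hot labels for one task."""
    for x_raw, y in zip(X, Y):
        h = features(x_raw[None, :]).T       # (d_feat, 1) representation
        h_proj = P @ h                       # restrict the update to the orthogonal subspace
        err = y[:, None] - W_out @ h         # prediction error on this sample
        W_out = W_out + lr * err @ h_proj.T  # old-task outputs stay (nearly) intact
        # Rank-one downdate of the projector so future updates also avoid
        # the direction spanned by this representation.
        P = P - (P @ h @ h.T @ P) / (alpha + h.T @ P @ h)
    return W_out, P

# Toy usage: two sequential "tasks" drawn from shifted input distributions.
for shift in (0.0, 2.0):
    X = rng.standard_normal((200, d_in)) + shift
    Y = np.eye(n_classes)[rng.integers(0, n_classes, 200)]
    W_out, P = train_on_task(X, Y, W_out, P)
```

Because the representation weights are never updated, each task's features stay in a fixed working state, and the projector confines output-weight changes to directions that leave earlier tasks' predictions approximately unchanged.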

