
Efficient Rehearsal Free Zero Forgetting Continual Learning using Adaptive Weight Modulation (2311.15276v1)

Published 26 Nov 2023 in cs.CV and cs.LG

Abstract: Artificial neural networks face a notable challenge known as continual learning, which involves acquiring knowledge of multiple tasks over an extended period. This challenge arises because previously learned weights are adjusted to suit the objectives of new tasks, resulting in a phenomenon called catastrophic forgetting. Most approaches to this problem seek a balance between maximizing performance on the new tasks and minimizing the forgetting of previous tasks. In contrast, our approach aims to maximize performance on the new task while ensuring zero forgetting. This is accomplished by creating task-specific modulation parameters for each task; only these parameters are learnable while subsequent tasks are trained. Through comprehensive experimental evaluations, our model demonstrates superior performance in acquiring and retaining novel tasks that pose difficulties for other multi-task models, emphasizing the efficacy of our approach in preventing catastrophic forgetting while accommodating the acquisition of new tasks.
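
The abstract only describes the mechanism at a high level, so the following is a minimal, hypothetical sketch of the general idea: shared base weights are frozen after initial training, and each task learns a small set of modulation parameters of its own. The multiplicative per-output-channel modulation, the class name ModulatedLinear, and the freeze_base helper are illustrative assumptions, not the paper's exact parameterization.

```python
# Hypothetical sketch (not the paper's exact method): frozen shared weights
# plus per-task multiplicative modulation, so earlier tasks are never altered.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModulatedLinear(nn.Module):
    """Linear layer with frozen shared weights and per-task modulation."""

    def __init__(self, in_features: int, out_features: int, num_tasks: int):
        super().__init__()
        # Shared base parameters: trained on the first task, then frozen.
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # One modulation vector per task; these remain the only learnable
        # parameters once later tasks are trained.
        self.modulation = nn.ParameterList(
            [nn.Parameter(torch.ones(out_features, 1)) for _ in range(num_tasks)]
        )

    def freeze_base(self) -> None:
        # Freezing the shared weights means previously learned task functions
        # cannot change: zero forgetting by construction.
        self.weight.requires_grad_(False)
        self.bias.requires_grad_(False)

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # Scale the frozen base weights with the task-specific modulation.
        w = self.weight * self.modulation[task_id]
        return F.linear(x, w, self.bias)
```

Under these assumptions, training a new task t would consist of calling freeze_base() and passing only layer.modulation[t] to the optimizer, so gradients never reach the shared weights used by earlier tasks.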

Authors (2)
  1. Yonatan Sverdlov (4 papers)
  2. Shimon Ullman (32 papers)
