
Selective Attention-based Modulation for Continual Learning (2403.20086v1)

Published 29 Mar 2024 in cs.CV

Abstract: We present SAM, a biologically-plausible selective attention-driven modulation approach to enhance classification models in a continual learning setting. Inspired by neurophysiological evidence that the primary visual cortex does not contribute to object manifold untangling for categorization and that primordial attention biases are still embedded in the modern brain, we propose to employ auxiliary saliency prediction features as a modulation signal to drive and stabilize the learning of a sequence of non-i.i.d. classification tasks. Experimental results confirm that SAM effectively enhances the performance (in some cases up to about twenty percentage points) of state-of-the-art continual learning methods, both in class-incremental and task-incremental settings. Moreover, we show that attention-based modulation successfully encourages the learning of features that are more robust to the presence of spurious features and to adversarial attacks than baseline methods. Code is available at: https://github.com/perceivelab/SAM.
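The core mechanism the abstract describes, using an auxiliary saliency prediction as a modulation signal over the classifier's features, can be sketched as an element-wise gating of feature maps by a normalized saliency map. This is an illustrative reconstruction only, not the authors' implementation: the tensor shapes, the min-max normalization, and the residual `1 + s` gating form are all assumptions; see the linked repository for the actual method.

```python
import numpy as np

def saliency_modulate(features, saliency):
    """Gate classifier feature maps with a predicted saliency map.

    features: (C, H, W) activations from the classification backbone.
    saliency: (H, W) raw saliency prediction for the same input.
    Returns features rescaled per spatial location by normalized saliency.
    """
    # Min-max normalize saliency to [0, 1] so it acts purely as a gain.
    s = saliency - saliency.min()
    span = s.max()
    if span > 0:
        s = s / span
    # Residual-style gating: salient locations are amplified (up to 2x),
    # while zero-saliency locations leave the features unchanged.
    return features * (1.0 + s)[None, :, :]

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 4, 4))   # toy backbone activations
sal = rng.random((4, 4))                 # toy saliency prediction
out = saliency_modulate(feats, sal)
print(out.shape)  # (8, 4, 4)
```

The residual form is one plausible choice for a stabilizing signal: because a zero saliency map reduces to the identity, the modulation can only re-weight existing features rather than suppress them entirely.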
