Adversarially Diversified Rehearsal Memory (ADRM): Mitigating Memory Overfitting Challenge in Continual Learning (2405.11829v1)
Abstract: Continual learning focuses on learning non-stationary data distributions without forgetting previous knowledge. Rehearsal-based approaches are commonly used to combat catastrophic forgetting. However, these approaches suffer from a problem called "rehearsal memory overfitting," where the model becomes too specialized on the limited memory samples and loses its ability to generalize effectively. As a result, the effectiveness of the rehearsal memory progressively decays, ultimately resulting in catastrophic forgetting of the learned tasks. We introduce the Adversarially Diversified Rehearsal Memory (ADRM) to address the memory overfitting challenge. This novel method is designed to enrich memory sample diversity and bolster resistance against natural and adversarial noise disruptions. ADRM employs FGSM (fast gradient sign method) attacks to introduce adversarially modified memory samples, achieving two primary objectives: enhancing memory diversity and fostering a robust response to continual feature drifts in memory samples. Our contributions are as follows: Firstly, ADRM addresses overfitting in rehearsal memory by employing FGSM to diversify and increase the complexity of the memory buffer. Secondly, we demonstrate that ADRM mitigates memory overfitting and significantly improves the robustness of CL models, which is crucial for safety-critical applications. Finally, our detailed analysis of features and visualization demonstrates that ADRM mitigates feature drifts in CL memory samples, significantly reducing catastrophic forgetting and resulting in a more resilient CL model. Additionally, our in-depth t-SNE visualizations of feature distributions and the quantification of feature similarity further enrich our understanding of feature representation in existing CL approaches. Our code is publicly available at https://github.com/hikmatkhan/ADRM.
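The core idea, FGSM-diversifying a rehearsal buffer, can be illustrated with a minimal sketch. The snippet below is not the paper's implementation (which is in the linked repository and uses deep networks); it substitutes a hand-differentiated logistic-regression "model" so the example is self-contained, and the names `fgsm_perturb`, `w`, `b`, and `buffer` are illustrative assumptions. The key step is the FGSM update x_adv = x + eps * sign(dL/dx), after which both the clean sample and its adversarial twin are kept in the buffer:

```python
import numpy as np

def fgsm_perturb(x, y, w, b, eps=0.03):
    """One-step FGSM on a logistic-regression surrogate model.

    Moves x in the direction that increases the binary cross-entropy
    loss: x_adv = clip(x + eps * sign(dL/dx), 0, 1).
    """
    z = x @ w + b
    p = 1.0 / (1.0 + np.exp(-z))   # sigmoid prediction
    grad_x = (p - y) * w           # dL/dx for binary cross-entropy
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

# Diversify a tiny rehearsal buffer: keep each clean sample
# alongside its adversarially perturbed counterpart.
rng = np.random.default_rng(0)
w, b = rng.normal(size=4), 0.1
buffer = [(rng.random(4), 1.0), (rng.random(4), 0.0)]

diversified = []
for x, y in buffer:
    diversified.append((x, y))                         # original sample
    diversified.append((fgsm_perturb(x, y, w, b), y))  # adversarial twin

print(len(diversified))  # 4: the buffer's diversity is doubled
```

In a deep-learning setting the gradient would instead come from backpropagation through the current CL model, but the structure is the same: perturb stored exemplars toward higher loss so the memory stays hard, rather than letting the model overfit a fixed set of easy samples.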
- Hikmat Khan
- Ghulam Rasool
- Nidhal Carla Bouaynaya