Can Continual Learning Improve Long-Tailed Recognition? Toward a Unified Framework (2306.13275v1)

Published 23 Jun 2023 in cs.LG and cs.CV

Abstract: The Long-Tailed Recognition (LTR) problem emerges in the context of learning from highly imbalanced datasets, in which the number of samples among different classes is heavily skewed. LTR methods aim to accurately learn a dataset comprising both a larger Head set and a smaller Tail set. We prove a theorem showing that, under the assumption of a strongly convex loss function, the weights of a learner trained on the full dataset lie within a bounded distance of the weights of the same learner trained strictly on the Head. Next, we assert that by treating the learning of the Head and Tail as two separate and sequential steps, Continual Learning (CL) methods can effectively update the weights of the learner to learn the Tail without forgetting the Head. First, we validate our theoretical findings with various experiments on the toy MNIST-LT dataset. We then evaluate the efficacy of several CL strategies on multiple imbalanced variations of two standard LTR benchmarks (CIFAR100-LT and CIFAR10-LT), and show that standard CL methods achieve strong performance gains in comparison to baselines and approach solutions that have been tailor-made for LTR. We also assess the applicability of CL techniques on real-world data by exploring CL on the naturally imbalanced Caltech256 dataset and demonstrate its superiority over state-of-the-art classifiers. Our work not only unifies LTR and CL but also paves the way for leveraging advances in CL methods to tackle the LTR challenge more effectively.
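One way to make the claimed bound precise (a sketch in our own notation, not necessarily the paper's exact statement): write \(\mathcal{L}_F\) for the loss over the full dataset of \(n\) samples, \(\mathcal{L}_H\) and \(\mathcal{L}_T\) for the average losses over the \(n_H\) Head and \(n_T\) Tail samples, and \(w^*_F\), \(w^*_H\) for the respective minimizers. If \(\mathcal{L}_F\) is \(\mu\)-strongly convex, then since \(\nabla\mathcal{L}_H(w^*_H) = 0\),

\[
\|w^*_F - w^*_H\| \;\le\; \frac{1}{\mu}\,\big\|\nabla \mathcal{L}_F(w^*_H)\big\| \;=\; \frac{n_T}{\mu\, n}\,\big\|\nabla \mathcal{L}_T(w^*_H)\big\|,
\]

so the full-data solution stays close to the Head-only solution whenever the Tail is small relative to the dataset, which is exactly the long-tailed regime. This is what motivates starting from the Head solution and updating it with the Tail.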
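As a concrete illustration of the two-step recipe the abstract describes, here is a minimal PyTorch-style sketch that trains on the Head first and then continues on the Tail under an EWC-style quadratic penalty (one representative CL method; the paper evaluates several). All function and variable names are our own hypothetical choices, not the authors' code.

```python
# Sketch: learn the Head, then learn the Tail with an EWC-style penalty
# so the Head is not forgotten. Illustrative only, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def train_phase(model, loader, epochs, ewc=None, lam=100.0, lr=1e-3):
    """One training phase; `ewc` = (fisher, anchor) computed after the Head phase."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(model(x), y)
            if ewc is not None:  # quadratic pull toward the Head-phase weights
                fisher, anchor = ewc
                for name, p in model.named_parameters():
                    loss = loss + (lam / 2) * (fisher[name] * (p - anchor[name]) ** 2).sum()
            opt.zero_grad()
            loss.backward()
            opt.step()

def estimate_fisher(model, loader):
    """Diagonal empirical Fisher on Head data: per-weight importance estimate."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in loader:
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        for n, p in model.named_parameters():
            fisher[n] += p.grad.detach() ** 2
    return {n: f / max(len(loader), 1) for n, f in fisher.items()}

# Usage (head_loader / tail_loader are hypothetical DataLoaders over the
# frequency-based split of an imbalanced dataset):
# model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
# train_phase(model, head_loader, epochs=5)                        # step 1: Head
# anchor = {n: p.detach().clone() for n, p in model.named_parameters()}
# fisher = estimate_fisher(model, head_loader)
# train_phase(model, tail_loader, epochs=5, ewc=(fisher, anchor))  # step 2: Tail
```

Any rehearsal- or regularization-based CL method can be dropped into the second phase in the same way; the key design choice is only that Head and Tail are presented as two sequential tasks.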

Authors (3)
  1. Mahdiyar Molahasani
  2. Michael Greenspan
  3. Ali Etemad