IMWA: Iterative Model Weight Averaging Benefits Class-Imbalanced Learning Tasks (2404.16331v2)
Abstract: Model Weight Averaging (MWA) is a technique that seeks to enhance a model's performance by averaging the weights of multiple trained models. This paper first finds empirically that 1) vanilla MWA can benefit class-imbalanced learning, and 2) performing model averaging in the early epochs of training yields a greater performance improvement than doing so in later epochs. Inspired by these two observations, we propose a novel MWA technique for class-imbalanced learning tasks named Iterative Model Weight Averaging (IMWA). Specifically, IMWA divides the entire training stage into multiple episodes. Within each episode, multiple models are trained concurrently from the same initial weights and subsequently averaged into a single model. The weights of this averaged model then serve as a fresh initialization for the ensuing episode, thus establishing an iterative learning paradigm. Compared to vanilla MWA, IMWA achieves higher performance improvements at the same computational cost. Moreover, IMWA can further enhance the performance of methods that employ an exponential moving average (EMA) strategy, demonstrating that IMWA and EMA can complement each other. Extensive experiments on various class-imbalanced learning tasks, i.e., class-imbalanced image classification, semi-supervised class-imbalanced image classification, and semi-supervised object detection, showcase the effectiveness of IMWA.
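The episode loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a toy noise-free linear regression in pure Python, and it assumes the parallel models differ only in the order in which they sample training data, which the abstract does not specify. All function names, hyperparameters, and the dataset are hypothetical choices for illustration.

```python
import random

def sgd_steps(weights, data, steps, lr, rng):
    """Run `steps` SGD updates of least-squares regression, starting from `weights`."""
    w = list(weights)
    for _ in range(steps):
        x, y = rng.choice(data)  # assumption: models differ only by sampling order
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def average_weights(models):
    """Element-wise mean of several weight vectors (the MWA step)."""
    n = len(models)
    return [sum(ws) / n for ws in zip(*models)]

def imwa(init, data, episodes, num_models, steps_per_episode, lr=0.1, seed=0):
    """Toy IMWA loop: train several models per episode, average, re-initialize."""
    weights = list(init)
    for ep in range(episodes):
        models = []
        for m in range(num_models):
            # Each model gets its own deterministic random stream.
            rng = random.Random(seed * 1_000_003 + ep * 1_009 + m)
            # Every model in the episode starts from the same weights.
            models.append(sgd_steps(weights, data, steps_per_episode, lr, rng))
        # The averaged weights initialize the next episode.
        weights = average_weights(models)
    return weights

def mse(w, data):
    """Mean squared error of the linear model `w` on `data`."""
    return sum((sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2 for x, y in data) / len(data)

# Hypothetical toy dataset: y = 2*x1 - x2 with inputs drawn from [-1, 1].
data_rng = random.Random(0)
data = []
for _ in range(64):
    x = (data_rng.uniform(-1, 1), data_rng.uniform(-1, 1))
    data.append((x, 2.0 * x[0] - 1.0 * x[1]))

final = imwa([0.0, 0.0], data, episodes=5, num_models=3, steps_per_episode=100)
print("final weights:", final, "mse:", mse(final, data))
```

In this sketch, setting `episodes=1` reduces the loop to a single averaging step at the end of training, which corresponds to vanilla MWA; keeping `episodes * steps_per_episode` fixed while varying `episodes` keeps the total number of updates per model constant.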
- Zitong Huang
- Ze Chen
- Bowen Dong
- Chaoqi Liang
- Erjin Zhou
- Wangmeng Zuo