- The paper proposes a gradient-based masking mechanism that dynamically adjusts training difficulty by masking salient image patches.
- It introduces a linear repeat curriculum schedule that alternates between easy and hard examples to prevent overfitting.
- Extensive evaluations on CIFAR-10, ImageNet, and PASCAL VOC demonstrate CBM’s superior performance over conventional training regimes and competing curriculum learning methods.
Curriculum by Masking (CBM): A Novel Curriculum Learning Strategy
The paper "CBM: Curriculum by Masking" introduces a new and effective approach to curriculum learning in machine learning, specifically tailored for neural models in object recognition and detection tasks. This paper discusses a method that leverages a novel masking algorithm to progressively introduce harder examples during training, leading to notable improvements in accuracy over both conventional training regimes and state-of-the-art curriculum learning methods.
The core idea of CBM is to mask salient regions of input images to control sample difficulty. The methodology is inspired by self-supervised pre-training techniques, particularly masked autoencoders, but extends these ideas to supervised learning through a novel curriculum schedule. The two primary contributions are the gradient-based masking mechanism and the easy-to-hard curriculum schedule.
Methodology
Gradient-Based Masking: The authors propose a masking procedure based on the gradient magnitudes of image patches. Patches with higher gradient magnitudes are prioritized for masking as they are more likely to contain discriminative features. This ensures that as training progresses and more patches are masked, the learning task becomes progressively harder, simulating a curriculum.
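To make the masking step concrete, the sketch below estimates per-patch saliency from Sobel gradient magnitudes and masks patches with probability proportional to that saliency. This is a minimal PyTorch illustration under stated assumptions, not the authors' exact implementation; the function name, the Sobel-based gradient estimate, and the sampling scheme are all illustrative.

```python
import torch
import torch.nn.functional as F

def gradient_mask(images, patch_size=8, mask_ratio=0.3):
    """Mask image patches with probability proportional to their
    average gradient magnitude (higher-gradient patches are more
    likely to be salient, so masking them makes the sample harder).

    images: (B, C, H, W) tensor, H and W divisible by patch_size.
    """
    B, _, H, W = images.shape
    # Sobel kernels for horizontal and vertical gradients.
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).to(images)
    ky = kx.t()
    gray = images.mean(dim=1, keepdim=True)              # collapse channels
    gx = F.conv2d(gray, kx.view(1, 1, 3, 3), padding=1)
    gy = F.conv2d(gray, ky.view(1, 1, 3, 3), padding=1)
    mag = (gx ** 2 + gy ** 2).sqrt()                     # (B, 1, H, W)

    # Average gradient magnitude per non-overlapping patch.
    saliency = F.avg_pool2d(mag, patch_size).flatten(1)  # (B, P)
    P = saliency.shape[1]
    n_mask = int(mask_ratio * P)
    if n_mask == 0:
        return images

    # Sample patches to mask, favoring high-gradient ones.
    idx = torch.multinomial(saliency + 1e-8, n_mask)     # (B, n_mask)
    keep = torch.ones(B, P, device=images.device)
    keep.scatter_(1, idx, 0.0)

    # Upsample the patch-level mask to pixel resolution and apply it.
    keep = keep.view(B, 1, H // patch_size, W // patch_size)
    keep = F.interpolate(keep, scale_factor=patch_size, mode="nearest")
    return images * keep
```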
Curriculum Schedule: Several curriculum schedules are explored, ranging from constant to linear to more sophisticated repeating schedules. The constant schedule serves as a baseline, amounting to straightforward data augmentation. Most notably, a linear repeat schedule, which alternates between easier and harder examples throughout training, shows the strongest results, preventing the model from overfitting to easy examples and underfitting harder ones.
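These schedules can be captured in a few lines. The sketch below contrasts a constant ratio, a single linear ramp, and a sawtooth-style linear repeat that periodically resets to easy (lightly masked) examples; the parameter names and the number of cycles are illustrative assumptions, not the paper's exact values.

```python
def mask_ratio_schedule(step, total_steps, max_ratio=0.5, cycles=4,
                        schedule="linear_repeat"):
    """Return the patch mask ratio for the current training step.

    'constant' keeps a fixed ratio (plain augmentation baseline),
    'linear' ramps once from 0 to max_ratio (easy-to-hard), and
    'linear_repeat' ramps up repeatedly, returning to easy samples
    at the start of each cycle (a sawtooth).
    """
    progress = step / max(total_steps, 1)
    if schedule == "constant":
        return max_ratio
    if schedule == "linear":
        return max_ratio * progress
    # linear_repeat: position within the current cycle, in [0, 1).
    cycle_progress = (progress * cycles) % 1.0
    return max_ratio * cycle_progress
```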
Empirical Evaluation
The authors extensively evaluate CBM across different neural architectures including ResNet-18, Wide-ResNet-50, CvT-13, and YOLOv5, and on multiple benchmark datasets such as CIFAR-10, CIFAR-100, ImageNet, Food-101, and PASCAL VOC for object detection.
Object Recognition: CBM outperforms conventional training methods and other curriculum learning strategies such as CBS, LSCL, LeRaC, and EfficientTrain across all datasets and architectures. For instance, on CIFAR-10, CBM registers an accuracy improvement of 1.28% over the baseline when using ResNet-18. Similar significant gains are observed for Wide-ResNet-50 and CvT-13 on CIFAR-100 and ImageNet.
Object Detection: Applying CBM to YOLOv5 on PASCAL VOC likewise demonstrates its benefits, achieving a higher mean Average Precision (mAP) than existing methods and reinforcing the versatility of the proposed approach.
Practical and Theoretical Implications
- Practical Implications: CBM requires only minimal modifications to existing training pipelines, making it easily applicable and highly versatile. The gradient-based masking procedure and easy-to-hard curriculum schedule are readily configurable, offering a straightforward yet potent tool for improving model performance on various tasks (a minimal integration sketch follows this list).
- Theoretical Implications: The proposed gradient-based masking strategy introduces a new way to conceptualize and implement sample difficulty in curriculum learning. By linking difficulty to salient image regions, CBM balances underfitting and overfitting, suggesting a pathway for further research into more dynamic and adaptive curriculum learning techniques.
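As an illustration of how little a pipeline has to change, the sketch below drops CBM-style masking into an otherwise standard supervised training loop. It reuses the hypothetical gradient_mask and mask_ratio_schedule helpers sketched earlier; this is a sketch of the integration pattern, not the authors' training code.

```python
import torch.nn.functional as F

def train_with_cbm(model, loader, optimizer, total_steps):
    """Standard supervised loop with one extra step: inputs are
    masked according to the current curriculum difficulty before
    the usual forward/backward pass."""
    model.train()
    for step, (images, labels) in enumerate(loader):
        ratio = mask_ratio_schedule(step, total_steps)    # curriculum difficulty
        masked = gradient_mask(images, mask_ratio=ratio)  # mask salient patches
        loss = F.cross_entropy(model(masked), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```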
Future Directions
Several promising avenues for future research emerge from this work:
- Expansion to Other Domains: Given its architecture-agnostic design, CBM can potentially be extended to other domains beyond vision, such as natural language processing and speech recognition.
- Adaptive Curriculum Learning: Incorporating feedback mechanisms to dynamically adjust the curriculum based on the model’s evolving state could lead to further performance enhancements.
- Combining with Data Augmentations: Integrating CBM with advanced data augmentation techniques like CutMix and Mixup may yield additional improvements, as suggested by preliminary investigations.
In conclusion, the paper "CBM: Curriculum by Masking" presents a significant advancement in the field of curriculum learning, demonstrating substantial empirical gains and offering a solid foundation for future explorations in dynamic and adaptive training schedules for neural models.