DropBlock: A regularization method for convolutional networks (1810.12890v1)

Published 30 Oct 2018 in cs.CV

Abstract: Deep neural networks often work well when they are over-parameterized and trained with a massive amount of noise and regularization, such as weight decay and dropout. Although dropout is widely used as a regularization technique for fully connected layers, it is often less effective for convolutional layers. This lack of success of dropout for convolutional layers is perhaps due to the fact that activation units in convolutional layers are spatially correlated so information can still flow through convolutional networks despite dropout. Thus a structured form of dropout is needed to regularize convolutional networks. In this paper, we introduce DropBlock, a form of structured dropout, where units in a contiguous region of a feature map are dropped together. We found that applying DropBlock in skip connections in addition to the convolution layers increases the accuracy. Also, gradually increasing the number of dropped units during training leads to better accuracy and more robustness to hyperparameter choices. Extensive experiments show that DropBlock works better than dropout in regularizing convolutional networks. On ImageNet classification, ResNet-50 architecture with DropBlock achieves $78.13\%$ accuracy, which is more than $1.6\%$ improvement on the baseline. On COCO detection, DropBlock improves Average Precision of RetinaNet from $36.8\%$ to $38.4\%$.

Authors (3)
  1. Golnaz Ghiasi (20 papers)
  2. Tsung-Yi Lin (49 papers)
  3. Quoc V. Le (128 papers)
Citations (866)

Summary

  • The paper introduces DropBlock, a structured dropout method that improves CNN robustness by dropping contiguous feature blocks.
  • The method gradually increases dropped regions during training, leading to significant performance gains on ImageNet, COCO, and PASCAL VOC.
  • Empirical results demonstrate enhanced classification accuracy and detection precision, validating DropBlock over traditional dropout techniques.

DropBlock: A Regularization Method for Convolutional Networks

The paper "DropBlock: A regularization method for convolutional networks" by Ghiasi, Lin, and Le addresses the limitations of traditional dropout methods in the context of convolutional neural networks (CNNs). This work introduces DropBlock, an enhanced dropout technique designed to improve the regularization of CNNs by leveraging contiguous regions in feature maps.

Motivation and Background

Dropout is a widely recognized regularization strategy, particularly effective for fully connected layers. Its efficacy diminishes in convolutional layers, however, because activations there are spatially correlated: randomly dropping individual units does not truly interrupt the flow of information, since neighboring units carry similar activations, and the network can still overfit.

DropBlock Methodology

DropBlock applies structured dropout by eliminating contiguous regions of a feature map rather than independent, randomly selected units. This forces the network to rely on more spatially diverse features, thereby reducing overfitting. DropBlock is parameterized by block_size, which determines the size of each dropped region, and γ, which sets the rate at which block centers are sampled.
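
To make the mechanism concrete, here is a minimal PyTorch sketch of a DropBlock layer, written from the paper's description rather than from the authors' code; the class name and interface are ours. γ follows Eq. (1) of the paper, with the feature map's height and width standing in for the single feat_size term, and block_size is assumed odd to keep the shape arithmetic simple.

```python
import torch
import torch.nn.functional as F


class DropBlock2d(torch.nn.Module):
    """Drops contiguous block_size x block_size regions of a feature map."""

    def __init__(self, block_size: int = 7, keep_prob: float = 0.9):
        super().__init__()
        assert block_size % 2 == 1, "odd block_size keeps the shape math simple"
        self.block_size = block_size
        self.keep_prob = keep_prob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Like dropout, DropBlock is a no-op at inference time.
        if not self.training or self.keep_prob >= 1.0:
            return x
        n, c, h, w = x.shape
        bs = self.block_size
        # Eq. (1) of the paper: rescale the Bernoulli rate so that the
        # expected fraction of dropped units is (1 - keep_prob) after
        # every sampled center grows into a bs x bs block.
        gamma = ((1.0 - self.keep_prob) / bs ** 2) * (
            (h * w) / ((h - bs + 1) * (w - bs + 1))
        )
        # Sample block centers only where a whole block fits in the map,
        # then pad back to the full spatial size.
        centers = (torch.rand(n, c, h - bs + 1, w - bs + 1,
                              device=x.device) < gamma).float()
        centers = F.pad(centers, [bs // 2] * 4)
        # Max-pooling expands each center into a full block; inverting
        # the result gives the binary keep-mask.
        keep = 1.0 - F.max_pool2d(centers, kernel_size=bs,
                                  stride=1, padding=bs // 2)
        # Normalize so the expected activation magnitude is unchanged.
        return x * keep * (keep.numel() / keep.sum().clamp(min=1.0))
```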

The authors emphasize the importance of gradually increasing the number of dropped units during training, which in practice means linearly decreasing keep_prob from 1.0 to its target value. This schedule yields more robust models and reduces sensitivity to hyperparameter settings, since dropping large contiguous regions from the very first steps can hinder learning; a minimal sketch of the schedule follows.
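
The sketch below assumes the paper's linear scheme, annealing keep_prob from 1.0 down to a target such as 0.9 (the value used for ResNet-50); the helper name and arguments are ours.

```python
def scheduled_keep_prob(step: int, total_steps: int,
                        target: float = 0.9) -> float:
    """Linearly anneal keep_prob from 1.0 down to its target value."""
    frac = min(step / max(total_steps, 1), 1.0)
    return 1.0 - frac * (1.0 - target)


# Hypothetical usage with the DropBlock2d sketch above:
# for step in range(total_steps):
#     drop_block.keep_prob = scheduled_keep_prob(step, total_steps)
#     ...  # forward/backward pass as usual
```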

Experimental Results

The paper presents extensive experimental validation across several computer vision tasks:

  1. ImageNet Classification: With a ResNet-50 architecture, DropBlock improves top-1 accuracy by 1.62 percentage points, from 76.51% to 78.13%. The paper also shows that the method consistently outperforms traditional dropout and other structured techniques such as SpatialDropout and DropPath.
  2. COCO Object Detection: Applying DropBlock to RetinaNet results in improved Average Precision (AP), from 36.8% to 38.4%. This improvement underscores the generalizability of DropBlock beyond image classification to object detection tasks.
  3. PASCAL VOC Semantic Segmentation: DropBlock shows significant improvement when the model is trained from scratch, narrowing the performance gap to models pre-trained on ImageNet.

Analytical Insights

Several analyses are conducted to bolster the findings:

  • Robustness: Models trained with DropBlock are more robust: when keep_prob is lowered at inference time, accuracy degrades more gracefully for models trained with larger block_size, indicating better generalization (see the sketch after this list).
  • Class Activation Mapping (CAM): Visualization of activation maps indicates that DropBlock encourages the network to learn more spatially distributed features. This is evident in the dispersed activation patterns seen in models trained with DropBlock compared to those without.
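
The robustness sweep can be reproduced in spirit with a short evaluation loop: keep the DropBlock module active at inference, sweep keep_prob, and record validation accuracy. This is a protocol sketch only; model, drop_block, and val_loader stand for a trained network, the DropBlock2d module sketched earlier, and a validation data loader.

```python
import torch


@torch.no_grad()
def sweep_keep_prob(model, drop_block, val_loader,
                    keep_probs=(1.0, 0.95, 0.9, 0.8, 0.7)):
    """Accuracy vs. keep_prob with block dropping left on at inference."""
    results = {}
    model.eval()
    drop_block.train()  # re-enable dropping inside the eval-mode model
    for kp in keep_probs:
        drop_block.keep_prob = kp
        correct = total = 0
        for images, labels in val_loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
        results[kp] = correct / total
    return results
```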

Implications and Future Directions

DropBlock demonstrates that structured regularization can significantly enhance the performance and robustness of CNNs, offering a straightforward yet powerful way to improve generalization across CNN-based applications.

Looking ahead, future research could explore:

  • Automated Block Size Adjustment: Adapting block_size dynamically based on the learning signal might further enhance performance.
  • Application to Different Domains: Extending DropBlock to other neural network architectures and domains, such as speech recognition or natural language processing, could expand its utility.
  • Combination with Other Regularization Techniques: Investigating how DropBlock can be integrated with other advanced regularization methods to achieve compounded benefits.

Conclusion

The DropBlock methodology provides a significant improvement over conventional dropout techniques by addressing the spatial correlations in convolutional layers. The empirical results validate its efficacy across multiple vision tasks, establishing it as a valuable tool in CNN regularization. This work marks a meaningful advance in the continual effort to refine and improve neural network training paradigms.
