- The paper introduces a Hierarchical-Split Block that enhances CNN models by extracting multi-scale features and improving performance on vision tasks.
- The design balances computational efficiency and rich feature representation through tunable parameters and hierarchical split-concatenation operations.
- Empirical results demonstrate that HS-ResNet50 achieves an 81.28% top-1 accuracy on ImageNet, outperforming several state-of-the-art models.
Overview of "HS-ResNet: Hierarchical-Split Block on Convolutional Neural Network"
The paper presents HS-ResNet, a convolutional neural network (CNN) architecture built around a novel Hierarchical-Split Block. The block is proposed as a flexible, plug-and-play module that augments existing CNNs with multi-scale feature extraction. The method marks a notable step in the design of efficient network architectures, with demonstrated performance improvements across several computer vision tasks.
Hierarchical-Split Block and Its Design
The Hierarchical-Split Block extracts multi-scale features through hierarchical split and concatenation operations. The block divides the input feature maps into groups that are processed sequentially: one group is connected directly to the output, while the remaining groups pass through convolutions whose results are partly fed forward into the next group. This structure balances computational efficiency against enhanced representational capacity.
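Under one plausible reading of the block's description, the data flow can be sketched in NumPy. The 3x3 convolution is replaced here by a box-filter placeholder, and the "carry half of each convolved result into the next group" scheme is an assumption drawn from the paper's figures, not its exact implementation:

```python
import numpy as np

def conv3x3(x):
    # Placeholder for a learned 3x3 convolution: a same-padding
    # per-channel box filter, just to keep shapes realistic.
    c, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += padded[:, dy:dy + h, dx:dx + w]
    return out / 9.0

def hierarchical_split_block(x, s):
    """Split channels into s groups; pass the first group through
    directly, convolve the rest, and feed half of each convolved
    result into the next group's input (a hedged reading of the
    paper's design, not a reference implementation)."""
    groups = np.array_split(x, s, axis=0)
    outputs = [groups[0]]            # first split connects directly
    carry = None
    for g in groups[1:]:
        inp = g if carry is None else np.concatenate([carry, g], axis=0)
        y = conv3x3(inp)
        half = y.shape[0] // 2
        outputs.append(y[:half])     # one half goes to the output
        carry = y[half:]             # the other half feeds the next group
    if carry is not None:
        outputs.append(carry)
    return np.concatenate(outputs, axis=0)
```

Note that under this splitting scheme the output keeps the same channel count as the input, so the block can drop into an existing bottleneck without reshaping its neighbors.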
The authors highlight an insightful trade-off in the block's design: a larger number of groups improves multi-scale representation but may slow inference. By tuning the block's parameters, namely the filter width (w) and the number of groups (s), the paper demonstrates flexibility across different application scenarios without significantly increasing computational complexity or parameter count.
Empirical Evaluation
In extensive empirical evaluations, HS-ResNet shows significant gains over established baselines. On the ImageNet-1K dataset, HS-ResNet50 achieves an impressive top-1 accuracy of 81.28%, outperforming many state-of-the-art models. HS-ResNet also improves mean average precision (mAP) on object detection and instance segmentation (using the COCO dataset) and mean Intersection over Union (mIoU) on semantic segmentation, demonstrating its versatility and robustness.
A comparison of network complexities shows that HS-ResNet consumes fewer resources than comparably configured models, primarily owing to its novel use of split and concatenate operations.
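A rough accounting makes the saving concrete. A plain 3x3 convolution over C channels costs 9·C² parameters, while the block's s−1 smaller convolutions each see only a fraction of the channels. The sketch below uses a hypothetical half-carry splitting scheme to tally the difference; the numbers are illustrative, not the paper's measurements:

```python
def conv_params(c_in, c_out, k=3):
    # Parameter count of a k x k convolution (bias ignored).
    return c_in * c_out * k * k

def hs_conv_params(channels, s, k=3):
    # Parameters of the s - 1 small convolutions inside one block,
    # assuming each conv preserves its channel count and carries
    # half of its output into the next group (an assumed scheme).
    per_group = channels // s
    total, carry = 0, 0
    for _ in range(s - 1):
        c_in = per_group + carry
        total += conv_params(c_in, c_in, k)
        carry = c_in // 2
    return total

plain = conv_params(256, 256)      # 589824 parameters
split = hs_conv_params(256, s=4)   # 232704 parameters, well under half
```

Under these toy assumptions the split variant needs less than half the parameters of a monolithic 3x3 convolution at the same total width, which is the intuition behind the complexity advantage reported in the paper.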
Implications and Future Directions
The architectural innovations presented in this work have both theoretical and practical implications. Theoretically, they underscore a promising direction in neural network design by rethinking traditional bottleneck structures and promoting richer feature representations. Practically, the flexibility of Hierarchical-Split blocks opens new avenues for optimizing CNNs across applications including image classification, detection, and segmentation.
Future work may explore the applicability of Hierarchical-Split blocks in other neural architectures, such as those arising from Neural Architecture Search (NAS) or those involved in more niche applications like optical character recognition and video classification. The potential integration with NAS frameworks could also allow for further optimization of search spaces, potentially enhancing automated design processes.
Conclusion
Overall, the introduction of Hierarchical-Split blocks in HS-ResNet offers a noteworthy enhancement to CNN architectures by efficiently leveraging multi-scale feature learning. The framework's adaptability and efficiency position it as a promising approach in addressing the increasing complexity and demands of modern computer vision tasks. As the model is shared through a collaborative platform, wider adoption and experimentation could lead to further refinements and innovations in the field.