- The paper introduces Dual Path Networks that integrate ResNet’s residual additivity with DenseNet’s dense concatenation to enhance both feature reuse and exploration.
- DPN achieves superior performance on benchmarks such as ImageNet and Places365 while reducing model size, computational cost, and memory overhead.
- Extensive experiments confirm that DPN outperforms state-of-the-art architectures in tasks such as image classification, object detection, and semantic segmentation.
Dual Path Networks
The paper "Dual Path Networks" proposes an innovative network architecture for image classification that combines the beneficial traits of two prominent deep learning architectures: Residual Networks (ResNet) and Dense Convolutional Networks (DenseNet). The authors introduced a new network topology, termed Dual Path Networks (DPN), designed to enhance the efficiency of feature reuse and exploration, ultimately leading to superior performance in various computer vision tasks. This essay provides a comprehensive overview of the paper, focusing on the key contributions, experimental results, and the potential implications for the field.
Core Contributions
The central contributions of the paper are as follows:
- Unified Analysis of ResNet and DenseNet: The paper offers a higher-order recurrent neural network (HORNN) perspective on ResNet and DenseNet, showing that ResNet enables feature reuse through element-wise addition, while DenseNet enables continuous exploration of new features through concatenation (see the sketch after this list).
- Introduction of Dual Path Networks (DPN): DPNs combine the additivity of ResNet's residual connections with the concatenative property of DenseNet's dense connections. This dual-path approach achieves high parameter efficiency and enables both feature reuse and novel feature discovery.
- Empirical Validation: Extensive experiments conducted on multiple benchmark datasets, including ImageNet-1k, Places365, and PASCAL VOC, validate that DPNs exceed the performance of state-of-the-art architectures with reduced model size, computation, and memory overhead.
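To make the contrast concrete, the two update rules can be written in their commonly used simplified forms (a sketch in standard notation, not the paper's exact HORNN equations; here h^k is the output of block k and F^k, H^k are learned transformations):

```latex
% ResNet: reuse features by adding a residual to the running state
h^{k} = h^{k-1} + F^{k}\!\left(h^{k-1}\right)

% DenseNet: explore new features by transforming the concatenation of all earlier outputs
h^{k} = H^{k}\!\left(\left[\,h^{0}, h^{1}, \ldots, h^{k-1}\,\right]\right)
```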
Technical Details
Dual Path Architecture: The proposed DPN uses a dual-path topology to balance feature reuse and feature exploration. The architecture is defined by a pair of recurrences: a densely connected path for feature exploration and a residual path for feature reuse:
- x^k denotes the features extracted from earlier steps along the densely connected path.
- y^k represents the cumulatively added features in the residual path.
- The outputs of the two paths are summed before the next transformation, so each step draws on both reused and newly explored features.
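Written out, the dual-path recurrence takes roughly the following form (a reconstruction following the paper's notation, where f_t^k, v_t, \phi^k, and g^k denote the learned transformations at micro-block k):

```latex
\begin{aligned}
x^{k} &\triangleq \sum_{t=1}^{k-1} f_{t}^{k}\!\left(h^{t}\right)
  && \text{densely connected path (feature exploration)} \\
y^{k} &\triangleq \sum_{t=1}^{k-1} v_{t}\!\left(h^{t}\right)
       = y^{k-1} + \phi^{k-1}\!\left(y^{k-1}\right)
  && \text{residual path (feature reuse)} \\
r^{k} &\triangleq x^{k} + y^{k}, \qquad h^{k} = g^{k}\!\left(r^{k}\right)
  && \text{fused output of the dual path}
\end{aligned}
```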
Complexity Analysis: Compared to DenseNet and ResNeXt models, DPNs demonstrate significant reductions in model size and computational cost while also keeping memory consumption lower. For example, the paper reports that a shallow DPN surpasses the best ResNeXt-101 with a 26% smaller model size, 25% less computational cost, and lower memory consumption.
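Much of this efficiency comes from the two paths sharing a single bottleneck of convolutions per micro-block and differing only in how the output is merged back. Below is a minimal sketch of one dual-path block in PyTorch, with hypothetical channel sizes; details of the official implementation such as pre-activation ordering and projection/downsampling blocks are omitted.

```python
# Minimal sketch of a single dual-path block (hypothetical channel sizes;
# simplified relative to the official DPN implementation).
import torch
import torch.nn as nn


class DualPathBlock(nn.Module):
    def __init__(self, in_ch, mid_ch, res_ch, dense_inc, groups=32):
        super().__init__()
        # Shared bottleneck: one stack of convolutions feeds BOTH paths,
        # which is the source of DPN's parameter efficiency.
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, kernel_size=3, padding=1,
                      groups=groups, bias=False),
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
            # Output is split: `res_ch` channels for the residual path,
            # `dense_inc` channels appended to the dense path.
            nn.Conv2d(mid_ch, res_ch + dense_inc, kernel_size=1, bias=False),
        )
        self.res_ch = res_ch

    def forward(self, res_path, dense_path):
        # Both paths are fused into a single input tensor.
        out = self.body(torch.cat([res_path, dense_path], dim=1))
        res_out, dense_out = out[:, :self.res_ch], out[:, self.res_ch:]
        # Residual path: element-wise addition (feature reuse).
        res_path = res_path + res_out
        # Dense path: channel-wise concatenation (feature exploration).
        dense_path = torch.cat([dense_path, dense_out], dim=1)
        return res_path, dense_path


# Toy usage with hypothetical sizes: the residual path holds 256 channels,
# and the dense path grows by 16 channels per block.
res = torch.randn(1, 256, 14, 14)
dense = torch.randn(1, 64, 14, 14)
block = DualPathBlock(in_ch=256 + 64, mid_ch=128, res_ch=256,
                      dense_inc=16, groups=32)
res, dense = block(res, dense)
print(res.shape, dense.shape)  # (1, 256, 14, 14) and (1, 80, 14, 14)
```

Splitting the final 1x1 convolution's output into an additive part and a concatenative part is what gives each block both ResNet-like and DenseNet-like behavior from one shared set of weights.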
Experimental Results
Image Classification: DPNs showcased their efficacy on the ImageNet-1k dataset, outperforming architectures such as ResNeXt and DenseNet. For instance, the DPN-92 model achieved a top-1 error of 20.7% versus 21.2% for ResNeXt-101, while requiring fewer computational resources.
Scene Classification: The robustness of DPN was also evident on the Places365 dataset, where the DPN-92 attained the highest top-1 accuracy of 56.84%, surpassing models such as VGG-16 and ResNet-152.
Object Detection and Semantic Segmentation: The performance of DPN was also validated on object detection and semantic segmentation using the PASCAL VOC datasets. DPN-92 achieved an mAP of 82.5% for detection and an mIoU of 74.8% for segmentation, considerably outperforming ResNet-101 and DenseNet-161.
Implications and Future Directions
The introduction of DPN offers remarkable enhancements in computational efficiency and accuracy, providing a practical solution for various computer vision tasks. The unified understanding of ResNet and DenseNet offered in the paper also paves the way for further exploration into hybrid architectures that can leverage the strengths of multiple topologies. Potential future research could investigate:
- Extending DPNs to other tasks: Beyond image-based tasks, DPNs could be adapted for use in sequential data or multi-modal learning scenarios.
- Optimization and Scalability: Further optimization of the DPN architecture for deployment on resource-constrained devices could be explored. Additionally, investigating the scalability of DPNs could provide insights into maintaining efficiency in increasingly deeper or wider networks.
- Higher-Order RNNs and DPN: The HORNN perspective suggests that the dual-path strategy could also benefit RNN-based tasks, potentially driving advances in language modeling and time-series prediction.
Conclusion
The paper "Dual Path Networks" makes a significant contribution to the design of convolutional neural network architectures by integrating the complementary strengths of ResNet and DenseNet. Through rigorous theoretical analysis and empirical validation, the paper demonstrates that DPNs are not only more efficient in terms of parameter utilization and computational cost but also achieve superior accuracy across a range of computer vision tasks. This innovative approach holds promising implications for future advancements in neural network architectures and their applications.