Insights on "Convolutional Neural Networks With Low-rank Regularization"
The paper "Convolutional Neural Networks With Low-rank Regularization" introduces innovative methodologies for enhancing the efficiency of convolutional neural networks (CNNs) without compromising their performance. This paper primarily focuses on the application of low-rank tensor decompositions to optimize the storage and computational costs of CNNs, a crucial advancement for deploying models on resource-constrained mobile devices.
Core Contributions
The authors present several key contributions that refine the low-rank decomposition methodology for CNNs:
- Exact Global Optimizer: A new algorithm computes the low-rank tensor decomposition in closed form and deterministically reaches the global optimum of the approximation problem. Unlike iterative schemes, which often converge to local minima, the closed-form solution gives a reliable basis for the low-rank approximation (a sketch of this style of factorization appears after this list).
- Training from Scratch: The paper introduces a method for training CNNs with low-rank constraints from scratch. The constraint is enforced by construction through the parameterization of the model, which allows deeper architectures and better accuracy; batch normalization is used to ease the resulting optimization difficulties (see the module sketch after this list).
- Extensive Experimentation with Modern CNNs: The proposed methods are evaluated on several CNN architectures, including NIN, AlexNet, VGG, and GoogLeNet, on the CIFAR-10 and ILSVRC12 datasets. The results show a considerable reduction in computational cost while accuracy is maintained or even improved.
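To make the closed-form decomposition concrete, below is a minimal NumPy sketch of an SVD-based separable factorization of a convolution kernel into vertical (d x 1) and horizontal (1 x d) filters. The tensor layout (N, C, d, d), the reshaping convention, and the function name are assumptions chosen for illustration; the paper derives the exact global optimizer for its own formulation of the decomposition.

```python
import numpy as np

def separable_factorize(W, K):
    """Approximate a conv kernel W of shape (N, C, d, d) by K vertical (d x 1)
    filters followed by N horizontal (1 x d) filters, via a truncated SVD of a
    reshaped copy of W. Illustrative sketch; layout conventions are assumptions.

    Returns:
      V: shape (K, C, d, 1) -- weights of the first (vertical) conv layer
      H: shape (N, K, 1, d) -- weights of the second (horizontal) conv layer
    """
    N, C, d, _ = W.shape
    # Rows index (input channel, vertical offset); columns index
    # (output channel, horizontal offset).
    M = W.transpose(1, 2, 0, 3).reshape(C * d, N * d)
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    root = np.sqrt(S[:K])
    V = (U[:, :K] * root).T.reshape(K, C, d, 1)
    H = (Vt[:K, :] * root[:, None]).reshape(K, N, d).transpose(1, 0, 2)[:, :, None, :]
    return V, H

# The composed separable kernel sum_k V[k, c] * H[n, k] equals the best
# rank-K approximation of the reshaped W in the Frobenius norm.
```

Because the truncated SVD is the best rank-K approximation of the reshaped matrix, no iterative refinement is needed to reach the optimum of this surrogate problem.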
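To illustrate the training-from-scratch parameterization, the low-rank constraint can be baked into the architecture itself: each d x d convolution becomes a vertical convolution into K intermediate channels, batch normalization, and then a horizontal convolution. The PyTorch module below is a hypothetical sketch; the class name, the placement of batch normalization, and the stride/padding handling are assumptions rather than the paper's exact recipe.

```python
import torch.nn as nn

class LowRankConv2d(nn.Module):
    """A d x d convolution expressed as a (d x 1) conv into K channels,
    batch norm, then a (1 x d) conv to the desired output channels.
    Because the weights are stored only in factored form, the low-rank
    constraint holds by construction throughout training."""

    def __init__(self, in_channels, out_channels, d, K, stride=1):
        super().__init__()
        pad = d // 2
        self.vertical = nn.Conv2d(in_channels, K, kernel_size=(d, 1),
                                  stride=(stride, 1), padding=(pad, 0), bias=False)
        self.bn = nn.BatchNorm2d(K)
        self.horizontal = nn.Conv2d(K, out_channels, kernel_size=(1, d),
                                    stride=(1, stride), padding=(0, pad), bias=False)

    def forward(self, x):
        return self.horizontal(self.bn(self.vertical(x)))
```

Since the rank constraint is built into the parameterization, no post-hoc decomposition or fine-tuning step is required after training.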
Empirical Findings
Through a series of detailed experiments, the authors report:
- On the CIFAR-10 dataset, the low-rank NIN model reaches 91.31% accuracy, outperforming the state-of-the-art results at the time. This is achieved without any data augmentation, suggesting the approach remains effective on small datasets.
- Substantial efficiency gains are observed across models. For instance, the low-rank VGG-16 roughly halves forward time while maintaining comparable accuracy (a back-of-the-envelope cost comparison appears after this list).
- The analysis shows that the reduction in storage and compute comes without a significant accuracy trade-off, in contrast to the drops typically reported in earlier compression work.
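A rough cost model clarifies where such speedups come from. Per output position, a dense d x d convolution with C input and N output channels needs about d^2·C·N multiply-accumulates, whereas the factored version needs about d·C·K + d·K·N. The layer sizes below are hypothetical and chosen only for illustration, not figures from the paper.

```python
d, C, N, K = 3, 256, 256, 64          # hypothetical layer sizes

dense_macs   = d * d * C * N          # full d x d convolution
lowrank_macs = d * C * K + d * K * N  # vertical conv + horizontal conv

print(dense_macs, lowrank_macs, dense_macs / lowrank_macs)
# 589824 98304 6.0  -> roughly a 6x reduction per output position
```

Measured wall-clock gains are typically smaller than the raw operation-count ratio because of memory traffic and per-layer overheads.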
Theoretical and Practical Implications
The implications of applying low-rank regularization to CNNs are twofold:
- Theoretical: The discovery that low-rank constrained CNNs can outperform their unconstrained counterparts raises questions about the nature of local minima and the potential for improved initialization strategies. It suggests that enforcing low-rank structures may implicitly act as a form of regularization, aiding in better generalization.
- Practical: By demonstrating that low-rank techniques work across mainstream CNN architectures, the paper makes efficient deployment of deep models on edge devices more realistic. The computational savings apply directly to real-time processing and mobile AI.
Future Directions
The success of this paper invites further exploration in several areas:
- Refinement and Automation of Rank Selection: Future work could develop automated methods for choosing the rank parameter K, possibly with adaptive or dynamic rank selection strategies (a simple illustrative heuristic is sketched after this list).
- Broader Architectural Impacts: Extending the low-rank framework to other components of neural networks, beyond convolutional layers (such as fully connected layers), presents promising avenues for broader application.
- Integration with Quantization Techniques: Combining low-rank decomposition with quantization could yield further efficiency gains, since the two approaches compress different aspects of the model and are largely complementary.
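On the rank-selection point, one simple baseline (a heuristic of our own, not the paper's procedure) is to keep the smallest K whose leading singular values retain a fixed fraction of the kernel's energy. The sketch below assumes the same (C·d, N·d) reshaping as the factorization sketch earlier.

```python
import numpy as np

def select_rank(W, energy=0.95):
    """Pick the smallest K whose top-K singular values of the reshaped kernel
    retain `energy` of the total squared singular-value mass.
    Illustrative heuristic only, not the paper's procedure."""
    N, C, d, _ = W.shape
    M = W.transpose(1, 2, 0, 3).reshape(C * d, N * d)
    s = np.linalg.svd(M, compute_uv=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(min(np.searchsorted(cumulative, energy) + 1, len(s)))
```

A per-layer energy threshold like this trades accuracy against compression with a single global knob, which makes it a natural starting point for the adaptive strategies the authors suggest.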
In conclusion, the paper presents a solid advance in CNN optimization, delivering substantial efficiency gains with minimal impact on accuracy. The results reinforce low-rank approximation as a practical tool for deploying deep neural networks at scale across platforms and applications.