- The paper introduces a novel framework built on two operations, Sparse Convolution (SC) and Valid Sparse Convolution (VSC), with VSC preserving the set of active sites throughout the network.
- It builds SSCN variants of popular architectures such as VGG, ResNet, and DenseNet, cutting memory and computational overhead by up to 50%.
- Experimental evaluations on CASIA and ModelNet-40 demonstrate significant speed-ups with minimal loss in accuracy, enabling real-time applications.
Submanifold Sparse Convolutional Networks: An Expert Overview
In this paper, the authors introduce an efficient framework for processing sparse data with convolutional networks, termed Submanifold Sparse Convolutional Networks (SSCNs). The approach targets the inefficiency of applying standard convolutional networks to sparse data, such as 3D LiDAR point clouds or handwritten characters.
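To make the data layout concrete, the following is a minimal sketch of how a sparse input can be stored as a map from active coordinates to feature vectors. This is my own illustration rather than code from the paper; the `active_sites` dict and its contents are hypothetical.

```python
import numpy as np

# Sparse 2D input: only "active" sites (non-zero pixels) are stored,
# as a dict mapping integer grid coordinates to feature vectors.
# A 64x64 handwriting image with ~5% active pixels stores ~200 entries
# instead of 4096 dense values.
active_sites = {
    (3, 7): np.array([1.0]),  # pen-stroke pixel intensities
    (3, 8): np.array([0.8]),
    (4, 8): np.array([0.9]),
}

# Coordinates absent from the dict are implicitly in the "ground state"
# (zero features), so memory scales with the number of active sites.
def feature_at(site):
    return active_sites.get(site, np.zeros(1))

print(feature_at((3, 7)))  # active site: [1.]
print(feature_at((0, 0)))  # ground state: [0.]
```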
Key Contributions and Methodology
Traditional convolutional networks, while effective for dense data, are computationally wasteful on sparse inputs: each convolution activates every site whose receptive field touches an active input, so the set of active sites dilates layer by layer and the sparsity erodes. SSCNs address this with two convolution operations that compute only on active entries. Sparse Convolution (SC) restricts computation to output sites whose receptive fields contain at least one active input site, while Valid Sparse Convolution (VSC) goes further and keeps an output site active only if the corresponding input site is active, eliminating the "dilating" effect and preserving the sparsity pattern exactly.
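The difference between the two active-site rules can be shown with a small, self-contained sketch. This illustrates the rules as described above, not the paper's implementation:

```python
from itertools import product

def sc_active_set(active, f=3):
    """Sparse Convolution (SC) rule: an output site is active if ANY
    input site in its f x f receptive field is active, so the active
    set dilates by f // 2 in every direction at each layer."""
    r = f // 2
    out = set()
    for (x, y) in active:
        for dx, dy in product(range(-r, r + 1), repeat=2):
            out.add((x + dx, y + dy))
    return out

def vsc_active_set(active, f=3):
    """Valid Sparse Convolution (VSC) rule: an output site is active
    only if the corresponding input site is active, so the sparsity
    pattern is preserved exactly, layer after layer."""
    return set(active)

# A one-pixel-wide stroke thickens under SC but not under VSC:
stroke = {(0, 0), (1, 1), (2, 2)}
print(len(sc_active_set(stroke)))   # 19 active sites after one SC layer
print(len(vsc_active_set(stroke)))  # 3 active sites, unchanged
```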
The authors construct network architectures from combinations of VSC and regular SC layers, building sparse counterparts of well-established models such as VGG, ResNet, and DenseNet. This design reduces computational overhead while maintaining competitive accuracy.
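As a structural sketch of such an architecture, one can track how the active set evolves through a VGG-style block of stride-1 VSC layers followed by a strided SC downsampling layer. The block plan below is a hypothetical illustration of this pattern, not the paper's exact configuration:

```python
def vsc(active):
    # VSC layer (filter 3, stride 1): sparsity pattern is unchanged.
    return set(active)

def sc_downsample(active, stride=2):
    # Strided SC layer (filter 2, stride 2): an output site is active
    # if any input site in its 2x2 block is active; resolution halves.
    return {(x // stride, y // stride) for (x, y) in active}

# One VGG-style sparse block: two VSC layers, then one strided SC.
stroke = {(0, 0), (1, 1), (2, 2), (3, 3)}
x = vsc(vsc(stroke))    # still 4 active sites at full resolution
x = sc_downsample(x)    # {(0, 0), (1, 1)} at half resolution
print(x)
```

Stacking such blocks keeps the per-layer cost proportional to the number of active sites rather than the full grid size.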
Experimental Evaluation
The authors validate SSCNs on two datasets: the 2D CASIA Chinese handwriting dataset and the 3D ModelNet-40 CAD model dataset. The results indicate that the SC and VSC operations not only reduce computational and memory requirements by up to 50% but also achieve state-of-the-art accuracy. On the CASIA dataset in particular, VSC reduced computation three- to four-fold relative to traditional convolutions, with only minimal accuracy loss.
Implications and Future Directions
This research has significant implications for building efficient deep-learning models on sparse data. SSCNs provide a scalable approach, making it feasible to process high-dimensional data without the typically prohibitive resource requirements, and they open avenues for applications that demand real-time processing, such as autonomous navigation and medical image analysis.
Future work could combine SSCNs with specialized data structures such as octrees to further optimize storage and computation. Deploying SSCNs in other domains characterized by sparse data could also reveal further enhancements and opportunities for application-specific tuning.
The paper charts a clear direction for reducing computational complexity in convolutional networks without sacrificing accuracy, a foundational step toward efficient processing of sparse data in neural networks.