Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity inspired by Network Science (1707.04780v2)

Published 15 Jul 2017 in cs.NE, cs.AI, and cs.LG

Abstract: Through the success of deep learning in various domains, artificial neural networks are currently among the most used artificial intelligence methods. Taking inspiration from the network properties of biological neural networks (e.g. sparsity, scale-freeness), we argue that (contrary to general practice) artificial neural networks, too, should not have fully-connected layers. Here we propose sparse evolutionary training of artificial neural networks, an algorithm which evolves an initial sparse topology (Erdős-Rényi random graph) of two consecutive layers of neurons into a scale-free topology, during learning. Our method replaces artificial neural networks fully-connected layers with sparse ones before training, reducing quadratically the number of parameters, with no decrease in accuracy. We demonstrate our claims on restricted Boltzmann machines, multi-layer perceptrons, and convolutional neural networks for unsupervised and supervised learning on 15 datasets. Our approach has the potential to enable artificial neural networks to scale up beyond what is currently possible.

Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity

This paper investigates scalable training of artificial neural networks (ANNs) through adaptive sparse connectivity, drawing on insights from network science. Traditional fully-connected layers are replaced with sparse layers to reduce memory and computational cost, in line with properties observed in biological neural networks such as sparsity and scale-freeness.

Key Contributions

The authors introduce a Sparse Evolutionary Training (SET) procedure that evolves an initially sparse architecture, represented as an Erdős-Rényi random graph, into a scale-free topology through the training process. This method significantly reduces the number of parameters without sacrificing accuracy, demonstrated across various models including restricted Boltzmann machines (RBMs), multi-layer perceptrons (MLPs), and convolutional neural networks (CNNs).
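A minimal NumPy sketch of this kind of Erdős-Rényi initialization is given below. It assumes that each possible connection between two consecutive layers of sizes n_in and n_out exists with probability epsilon * (n_in + n_out) / (n_in * n_out), where epsilon is the sparsity parameter described in the paper; the function name and the weight-initialization scale are illustrative, not taken from the authors' code.

```python
import numpy as np

def sparse_er_layer(n_in, n_out, epsilon=20, rng=None):
    """Create a sparse weight matrix whose connectivity follows an
    Erdos-Renyi random graph. `epsilon` controls sparsity: the expected
    number of weights is about epsilon * (n_in + n_out) rather than
    the n_in * n_out of a dense layer."""
    if rng is None:
        rng = np.random.default_rng()
    p = epsilon * (n_in + n_out) / (n_in * n_out)   # connection probability
    mask = rng.random((n_in, n_out)) < p            # which weights exist
    weights = rng.normal(0.0, 0.05, size=(n_in, n_out)) * mask
    return weights, mask

# Example: a 784 -> 1000 layer keeps roughly 20 * (784 + 1000) = 35,680
# connections instead of 784 * 1000 = 784,000 dense weights.
W, M = sparse_er_layer(784, 1000)
print(M.sum(), "active connections out of", M.size)
```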

Methodology

The SET procedure starts from a sparse random topology and uses an evolutionary scheme to adapt the connections during training. At the end of each training epoch, a fraction of the weights closest to zero is removed and an equal number of new connections is added at randomly chosen positions, keeping the parameter count constant. Over many epochs this rewiring drives the topology toward a scale-free distribution, mirroring the robust and efficient connectivity of biological systems.
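A minimal sketch of such a prune-and-regrow step is shown below, assuming a removal fraction zeta applied to the existing weights of smallest magnitude; the function name, the default value of zeta, and the re-initialization scale for new connections are illustrative rather than taken from the paper.

```python
import numpy as np

def evolve_connections(weights, mask, zeta=0.3, rng=None):
    """One SET-style rewiring step: drop the fraction `zeta` of existing
    weights whose magnitude is closest to zero, then grow the same number
    of new connections at random empty positions, so the total number of
    parameters stays constant."""
    if rng is None:
        rng = np.random.default_rng()
    active = np.flatnonzero(mask)
    n_remove = int(zeta * active.size)

    # Remove the active weights closest to zero.
    magnitudes = np.abs(weights.flat[active])
    to_remove = active[np.argsort(magnitudes)[:n_remove]]
    mask.flat[to_remove] = False
    weights.flat[to_remove] = 0.0

    # Regrow the same number of connections at random inactive positions.
    inactive = np.flatnonzero(~mask)
    to_add = rng.choice(inactive, size=n_remove, replace=False)
    mask.flat[to_add] = True
    weights.flat[to_add] = rng.normal(0.0, 0.05, size=n_remove)
    return weights, mask
```

Calling this once per epoch on each sparse layer (after the usual gradient updates) is the essence of the rewiring loop described above.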

The paper evaluates SET on unsupervised and supervised learning tasks using 15 benchmark datasets. The sparse topology demonstrates a significant reduction in computational complexity with comparable or improved accuracy compared to densely connected counterparts.

Numerical Results

  • RBMs: SET-RBM outperforms fully-connected RBMs on 7 of 11 datasets while using substantially fewer parameters. On MNIST, SET-RBM achieved -86.41 nats while retaining only about 2% of the parameters of its dense counterpart.
  • MLPs: SET-MLP consistently surpasses fully-connected MLPs while using up to 99% fewer parameters, notably reaching 74.84% accuracy on CIFAR10 with far fewer parameters than a traditional MLP.
  • CNNs: SET-CNNs match or exceed the performance of conventional CNNs on image datasets like CIFAR10, while drastically reducing the number of connections.

Implications

The work suggests that incorporating adaptive, sparse connectivity into ANN design could lead to more efficient and scalable models. Replacing the quadratic parameter count of a dense layer with one that grows only linearly in the layer widths implies substantial savings in memory and computation. The approach could make large networks tractable on modest hardware, which is particularly relevant for deep learning in resource-constrained environments.
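As a rough back-of-the-envelope illustration of this scaling argument (assuming the Erdős-Rényi density sketched earlier and a hypothetical sparsity level epsilon = 20), the contrast between quadratic and linear growth in layer width looks like this:

```python
n_in, n_out, epsilon = 4096, 4096, 20

dense_params = n_in * n_out                    # grows quadratically with width
sparse_params = int(epsilon * (n_in + n_out))  # grows only linearly with width

print(dense_params)   # 16,777,216 weights in the dense layer
print(sparse_params)  # 163,840 weights, roughly 1% of the dense count
```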

Future Directions

The authors propose exploring alternative initialization and pruning strategies to further improve the adaptability and efficiency of sparse networks. In addition, improved hardware support for sparse operations could further amplify the benefits of SET.

The paper provides a promising framework for designing scalable neural networks by integrating principles from network science, and it points toward larger, more efficient models than today's fully-connected ANNs in terms of complexity and resource utilization.

Authors (6)
  1. Decebal Constantin Mocanu (52 papers)
  2. Elena Mocanu (15 papers)
  3. Peter Stone (184 papers)
  4. Phuong H. Nguyen (9 papers)
  5. Madeleine Gibescu (7 papers)
  6. Antonio Liotta (27 papers)
Citations (573)