SpinalNet: Deep Neural Network with Gradual Input (2007.03347v3)

Published 7 Jul 2020 in cs.CV, cs.LG, cs.NE, and eess.IV

Abstract: Deep neural networks (DNNs) have achieved the state of the art performance in numerous fields. However, DNNs need high computation times, and people always expect better performance in a lower computation. Therefore, we study the human somatosensory system and design a neural network (SpinalNet) to achieve higher accuracy with fewer computations. Hidden layers in traditional NNs receive inputs in the previous layer, apply activation function, and then transfer the outcomes to the next layer. In the proposed SpinalNet, each layer is split into three splits: 1) input split, 2) intermediate split, and 3) output split. Input split of each layer receives a part of the inputs. The intermediate split of each layer receives outputs of the intermediate split of the previous layer and outputs of the input split of the current layer. The number of incoming weights becomes significantly lower than traditional DNNs. The SpinalNet can also be used as the fully connected or classification layer of DNN and supports both traditional learning and transfer learning. We observe significant error reductions with lower computational costs in most of the DNNs. Traditional learning on the VGG-5 network with SpinalNet classification layers provided the state-of-the-art (SOTA) performance on QMNIST, Kuzushiji-MNIST, EMNIST (Letters, Digits, and Balanced) datasets. Traditional learning with ImageNet pre-trained initial weights and SpinalNet classification layers provided the SOTA performance on STL-10, Fruits 360, Bird225, and Caltech-101 datasets. The scripts of the proposed SpinalNet are available at the following link: https://github.com/dipuk0506/SpinalNet

Citations (119)

Summary

  • The paper presents an innovative architecture inspired by human somatosensory processes that reduces network complexity by employing gradual input segmentation.
  • Experimental results demonstrate error rate reductions and state-of-the-art performance on datasets such as MNIST variants and STL-10 when replacing traditional fully connected layers.
  • The design alleviates the vanishing gradient issue and accelerates training by minimizing redundant computations, offering a scalable solution for resource-constrained applications.

An Overview of SpinalNet: A Deep Neural Network with Gradual Input

The paper "SpinalNet: Deep Neural Network with Gradual Input" presents a novel architecture designed to improve the computational efficiency and accuracy of traditional Deep Neural Networks (DNNs) by taking inspiration from the human somatosensory system. Classical DNN architectures operate by fully connecting layers in a linear fashion, which can result in a large number of parameters and subsequent computational overhead. SpinalNet addresses these issues by introducing a structure that incorporates gradual input processing at each layer, aiming to reduce complexity and achieve superior performance with less computational cost.

Core Contributions and Architecture

The proposed SpinalNet architecture departs from the conventional fully connected approach by dividing each hidden layer into three parts: an input split, an intermediate split, and an output split. Rather than connecting every input to every neuron of the next layer, each layer's input split receives only a portion of the overall input, and its intermediate split combines that portion with the output of the previous layer's intermediate split. The number of incoming weights per layer is therefore significantly reduced, which lowers the computational demands.
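
To make the split structure concrete, the following is a minimal PyTorch sketch of a SpinalNet-style classification head. The class name `SpinalHead`, the layer widths, and the choice of four hidden layers are illustrative assumptions, not the paper's exact configuration; see the released repository for the authors' implementation.

```python
import torch
import torch.nn as nn

class SpinalHead(nn.Module):
    """Illustrative SpinalNet-style classification head (a sketch):
    the input feature vector is split in half, and each narrow hidden
    layer sees one half of the input plus the output of the previous
    hidden layer. All hidden outputs are concatenated for the final
    classification layer."""

    def __init__(self, in_features=512, layer_width=128, num_layers=4, num_classes=10):
        super().__init__()
        self.half = in_features // 2
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            # The first layer sees only half of the input features; later
            # layers also receive the previous layer's output (the
            # "intermediate split"), so each layer has few incoming weights.
            in_dim = self.half if i == 0 else self.half + layer_width
            self.layers.append(nn.Sequential(nn.Linear(in_dim, layer_width), nn.ReLU()))
        self.out = nn.Linear(layer_width * num_layers, num_classes)

    def forward(self, x):
        halves = [x[:, : self.half], x[:, self.half :]]
        outputs, prev = [], None
        for i, layer in enumerate(self.layers):
            part = halves[i % 2]  # alternate which half of the input each layer receives
            h = layer(part if prev is None else torch.cat([part, prev], dim=1))
            outputs.append(h)
            prev = h
        # Output split: concatenate all intermediate outputs and classify.
        return self.out(torch.cat(outputs, dim=1))
```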

One of SpinalNet's standout features is its inspiration from the human spinal cord and somatosensory system, which receive sensory inputs gradually along the cord rather than all at once; the architecture mirrors this by presenting the input to the network in portions across successive layers. This design is supported theoretically, including a proof of universal approximation for SpinalNet, affirming that it can approximate as broad a class of functions as a fully connected network.

Performance Evaluations

Empirical evaluations showcase SpinalNet's efficacy across numerous datasets. The architecture is assessed both as an independent structure and as an enhancement to existing models like VGG, particularly in fully connected layers. Notably, SpinalNet demonstrates reduced error rates with fewer parameters and multiplications when applied to traditional learning models such as VGG-5 across MNIST-related datasets (Kuzushiji-MNIST, Fashion-MNIST, etc.).

Moreover, with ImageNet pre-trained initialization, SpinalNet achieves state-of-the-art (SOTA) performance on several datasets, including STL-10, Fruits 360, Bird225, and Caltech-101, indicating its adaptability and effectiveness at transferring knowledge from pre-trained models. When combined with transfer learning, the architecture maintains its efficiency, consistently offering higher accuracy than conventional DNN heads with a similar parameter budget.
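
As a rough illustration of how such a head plugs into a transfer-learning pipeline, one might replace the classifier of an ImageNet-pretrained VGG backbone with the `SpinalHead` sketched above. This assumes the hypothetical `SpinalHead` class from the earlier sketch and a recent torchvision API; the backbone choice, widths, and class count are placeholders, not the paper's reported setup.

```python
from torchvision import models

# Reuse ImageNet-pretrained VGG-19 (batch-norm variant) convolutional features
# and replace only the fully connected classifier with the SpinalHead sketch.
backbone = models.vgg19_bn(weights=models.VGG19_BN_Weights.IMAGENET1K_V1)
backbone.classifier = SpinalHead(in_features=512 * 7 * 7,  # flattened VGG feature size
                                 layer_width=256,
                                 num_layers=4,
                                 num_classes=101)           # e.g. Caltech-101

# Optionally freeze the convolutional features and train only the new head.
for p in backbone.features.parameters():
    p.requires_grad = False
```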

Theoretical and Practical Implications

Theoretically, SpinalNet's architecture provides a robust framework for reducing the vanishing gradient problem and expediting neural network training by limiting redundant computations. This design effectively simplifies the network's depth without sacrificing expressiveness or learning capacity, establishing a novel paradigm in DNN architecture development.

Practically, SpinalNet holds potential for a broad range of applications where traditional DNNs are limited by computational power and memory constraints. The reduction in parameters without compromising accuracy means that this architecture could greatly benefit industries reliant on large-scale data processing and real-time analytics, such as autonomous systems and smart devices.

Speculation on Future Developments

Moving forward, further exploration could include broader deployment of SpinalNet in real-world environments, adapting it to network families beyond the CNN backbones (such as VGG) studied here, and tackling new learning scenarios such as zero-shot learning and adaptive hyperparameter tuning. SpinalNet could also be combined with ensemble techniques to further enhance robustness and accuracy.

In conclusion, the SpinalNet architecture poses a compelling alternative to existing neural network designs, promising significant improvements in computational efficiency and model performance. The comprehensive experimental results across various datasets underscore its potential as a practical and scalable solution for both academia and industry in advancing machine learning capabilities.
