An experimental comparative study of backpropagation and alternatives for training binary neural networks for image classification (2408.04460v1)

Published 8 Aug 2024 in cs.LG

Abstract: Current artificial neural networks are trained with parameters encoded as floating point numbers that occupy lots of memory space at inference time. Due to the increase in the size of deep learning models, it is becoming very difficult to consider training and using artificial neural networks on edge devices. Binary neural networks promise to reduce the size of deep neural network models, as well as to increase inference speed while decreasing energy consumption. Thus, they may allow the deployment of more powerful models on edge devices. However, binary neural networks are still proven to be difficult to train using the backpropagation-based gradient descent scheme. This paper extends the work of Crulis et al. (2023), which proposed adapting to binary neural networks two promising alternatives to backpropagation originally designed for continuous neural networks, and experimented with them on simple image classification datasets. This paper proposes new experiments on the ImageNette dataset, compares three different model architectures for image classification, and adds two additional alternatives to backpropagation.

Summary

  • The paper compares backpropagation with several alternative training algorithms for Binary Neural Networks (BNNs) on image classification tasks using various architectures and datasets.
  • Experiments show backpropagation remains the most reliable method for modern architectures with features like skip-connections, though DFA shows competitive performance on architectures without such connections, such as VGG-19.
  • The study highlights the accuracy trade-offs incurred when binarizing weights and activations, and suggests that BP remains the most dependable choice while alternatives such as DFA and DRTP offer computational advantages worth exploring for edge devices.

Overview of Binary Neural Network Training Methods

The paper presents a comparative analysis of various training methodologies for Binary Neural Networks (BNNs), with a focus on alternatives to the traditional backpropagation (BP) algorithm. BNNs are of significant interest due to their potential to reduce computational complexity, memory requirements, and energy consumption, thereby facilitating the deployment of neural networks on edge devices like smartphones. This study extends prior research by experimenting with more complex datasets, such as ImageNette, and incorporating additional alternative training algorithms.

Highlights

The research delineates several key insights:

  • Binary Neural Networks (BNNs): BNNs encode each parameter in a single bit, which can drastically reduce model size and improve computational efficiency through low-level binary operations such as XNOR. However, training BNNs with standard methods remains challenging because of the approximations needed to handle non-differentiable binary units (see the binarization sketch following this list).
  • Training Algorithms: The study compares backpropagation with alternatives such as Direct Feedback Alignment (DFA), Direct Random Target Projection (DRTP), the Hilbert-Schmidt Independence Criterion (HSIC), and SigpropTL. Each alternative offers distinct qualities, particularly in terms of biological plausibility and computational efficiency (a minimal DFA update sketch also follows this list).
  • Experiments and Results: The experiments were conducted on several well-established deep learning architectures including VGG-19, MobileNet-V2, and MLP-Mixer, across various datasets. Results indicate that binary models trained using alternative methods often underperform compared to continuous models. For instance, while BP remains the most reliable method for training modern architectures with features like skip-connections, DFA showed competitive performance on architectures without such connections, like VGG-19.
  • Impact of Binarization: Binarizing weights and activations degrades model accuracy to different degrees, with traditional methods like BP showing significant performance drops, especially when weights are binarized.
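
The non-differentiable sign quantizer mentioned above is typically handled with the straight-through estimator (STE) when training BNNs with backpropagation. Below is a minimal PyTorch sketch, assuming a clipped-identity STE over real-valued latent weights; it is illustrative only, not the paper's implementation, and the class and layer names are hypothetical.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """sign() in the forward pass, clipped-identity gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)  # values in {-1, 0, +1}; torch.sign maps 0 to 0

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pass the gradient through only where the latent weight lies in [-1, 1].
        return grad_output * (x.abs() <= 1).float()

class BinaryLinear(torch.nn.Module):
    """Linear layer with real-valued latent weights; the forward pass uses a binarized copy."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = torch.nn.Parameter(0.01 * torch.randn(out_features, in_features))

    def forward(self, x):
        w_bin = BinarizeSTE.apply(self.weight)
        return x @ w_bin.t()

# Usage: gradients flow back to the latent real-valued weights.
layer = BinaryLinear(8, 4)
loss = layer(torch.randn(2, 8)).sum()
loss.backward()
print(layer.weight.grad.shape)  # torch.Size([4, 8])
```

Keeping real-valued latent weights lets small gradient updates accumulate across steps, while only the binarized copy is used in the forward pass.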
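
For contrast with backpropagation, the next sketch shows the core of a Direct Feedback Alignment (DFA) update on a toy two-layer network: the output error is projected to the hidden layer through a fixed random feedback matrix rather than through the transposed forward weights, so no layer-by-layer (recursive) gradient computation is needed. This is a generic illustration under standard DFA assumptions, not the paper's code; all names and sizes are hypothetical.

```python
import torch

torch.manual_seed(0)
batch, d_in, d_hid, d_out, lr = 32, 16, 64, 10, 0.1

W1 = torch.randn(d_in, d_hid) * 0.1
W2 = torch.randn(d_hid, d_out) * 0.1
B1 = torch.randn(d_out, d_hid)  # fixed random feedback matrix (never trained)

x = torch.randn(batch, d_in)
y = torch.nn.functional.one_hot(torch.randint(0, d_out, (batch,)), d_out).float()

# Forward pass.
a1 = x @ W1
h1 = torch.tanh(a1)
logits = h1 @ W2
probs = torch.softmax(logits, dim=1)

# Output error (gradient of softmax + cross-entropy).
e = (probs - y) / batch

# DFA: project the output error directly to the hidden layer through B1,
# instead of through W2.T as backpropagation would.
delta1 = (e @ B1) * (1 - h1 ** 2)  # tanh'(a1) = 1 - tanh(a1)^2

# Parameter updates (plain SGD).
W2 -= lr * (h1.t() @ e)
W1 -= lr * (x.t() @ delta1)
```

DRTP follows the same pattern but projects random transformations of the targets rather than the output error, removing even the dependence on the network's output during the hidden-layer update.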

Implications and Future Prospects

From a practical standpoint, the paper emphasizes the continued relevance of backpropagation for training BNNs, despite the promising contributions of alternatives in terms of memory efficiency and theoretical insights into training dynamics without recursive gradient computations. The findings suggest that while alternatives may offer computational advantages, their real-world applicability remains constrained by accuracy limitations.

Future research could explore optimizing these alternative algorithms and assess their scalability and performance in practical deployment scenarios on resource-constrained devices. Furthermore, these insights might stimulate advancements in hardware design tailored for BNNs, potentially creating synergies between software algorithms and hardware capabilities to maximize throughput and energy efficiency.

Conclusion

This comparative study offers critical insights into the effectiveness of alternative training methodologies for BNNs. While backpropagation continues to set the benchmark for model accuracy, alternatives such as DFA and DRTP merit consideration for specific architectures and applications where computational resources are tightly constrained. The ongoing examination of these methods could yield pathways to innovative solutions for deploying sophisticated neural networks in edge computing environments.
