- The paper presents an alternative neural network training method using the Moore-Penrose pseudoinverse instead of gradient descent.
- It employs direct weight and bias corrections calculated via singular value decomposition, bypassing iterative optimization.
- Empirical evaluations across standard datasets reveal varying accuracies, suggesting potential for further algorithmic refinement.
An Alternative Backpropagation Algorithm Using Moore-Penrose Pseudoinverse
The paper "A New Backpropagation Algorithm without Gradient Descent" by Varun Ranganathan and S. Natarajan presents an innovative approach to neural network training that circumvents the traditional use of Gradient Descent. This work introduces an algorithm to update weights and biases using the Moore-Penrose Pseudoinverse. This paper addresses the limitations of Gradient Descent, such as slow convergence and inefficiency near local minima, and proposes an alternative methodology.
Key Contributions
The primary contribution is a backpropagation method that eliminates the need for the Gradient Descent algorithm, instead leveraging the Moore-Penrose Pseudoinverse to adjust the weights and biases of Artificial Neural Networks (ANNs) during training. The approach modifies the neuron structure by assigning a unique bias to each input; this leaves the network's output unchanged while casting the weight and bias updates in a form suited to the pseudoinverse calculation.
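A minimal sketch of this modified neuron is given below, assuming the per-input biases enter additively so that the pre-activation is the sum of w_i * x_i + b_i over all inputs; the function names are illustrative, not taken from the paper.

```python
import numpy as np

def neuron_per_input_bias(x, w, b):
    """Modified neuron: each input x_i carries its own bias b_i,
    so the pre-activation is sum_i (w_i * x_i + b_i)."""
    return np.sum(w * x + b)

def neuron_single_bias(x, w, b_total):
    """Standard neuron: a single shared bias added to the dot product."""
    return np.dot(w, x) + b_total

# Collapsing the per-input biases into their sum recovers the standard
# neuron, so the network's end result is unchanged.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.3, 0.8, -0.2])
b = np.array([0.1, 0.05, -0.02])
assert np.isclose(neuron_per_input_bias(x, w, b),
                  neuron_single_bias(x, w, b.sum()))
```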
Algorithmic Framework
The proposed framework discards iterative optimization in favor of a direct calculation of weight and bias corrections using the pseudoinverse. The corrections are obtained from the difference between the current and desired outputs, with the pseudoinverse computed via singular value decomposition so that non-square matrices are handled. The framework adapts to different input dimensions and works with standard activation functions under some constraints, favoring ReLU-like functions in particular.
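The central computation can be sketched as a least-squares solve using NumPy's SVD-based pseudoinverse. This is an illustrative reconstruction rather than the authors' exact procedure: it assumes the desired outputs are mapped back through the inverse of the activation function, and the names (pinv_layer_update, softplus_inv) are hypothetical.

```python
import numpy as np

def pinv_layer_update(X, Y_target, activation_inverse):
    """Solve for a layer's weights and bias directly with the
    Moore-Penrose pseudoinverse instead of gradient descent.

    X                  -- (n_samples, n_inputs) layer inputs
    Y_target           -- (n_samples, n_outputs) desired layer outputs
    activation_inverse -- elementwise inverse of the activation, mapping
                          desired outputs to desired pre-activations
                          (assumes the targets lie in the activation's range)
    """
    # Desired pre-activations Z such that activation(Z) = Y_target.
    Z = activation_inverse(Y_target)

    # Augment X with a column of ones so the solve also yields a bias row.
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])

    # np.linalg.pinv uses SVD, giving the least-squares solution of
    # X_aug @ W ≈ Z even when X_aug is non-square.
    W = np.linalg.pinv(X_aug) @ Z
    return W  # last row is the bias; the remaining rows are the weights

# Example with Softplus, whose inverse is log(exp(y) - 1).
softplus_inv = lambda y: np.log(np.expm1(y))
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
Y = np.log1p(np.exp(X @ rng.normal(size=(4, 1))))  # synthetic Softplus targets
W = pinv_layer_update(X, Y, softplus_inv)
```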
Empirical Evaluation
The research evaluates the technique on several datasets, including the well-known Telling-Two-Spirals-Apart, Separating-Concentric-Circles, and XOR problems, as well as the Wisconsin Breast Cancer dataset. Numerical results indicate differing levels of success:
- An accuracy of approximately 63% on the Two-Spirals problem, showing some capacity to handle highly non-linear decision boundaries.
- Approximately 61% accuracy on the Concentric-Circles problem, constrained by activation function choice.
- An 81% accuracy for the XOR problem.
- A peak validation accuracy of 90.4% on the Wisconsin Breast Cancer dataset.
These results suggest that the method is functional though perhaps suboptimal with the chosen Softplus activation function.
Implications and Future Directions
The implications of this research are both practical and theoretical. Practically, it offers a potential alternative for scenarios where Gradient Descent performs poorly. Theoretically, it challenges the assumption that neuron updates must be driven by gradients of differentiable functions, suggesting broader applicability whenever the activation function's domain and range are suitably aligned.
Looking forward, further research may optimize the algorithm for efficiency and accuracy, and explore additional activation functions that better suit the pseudoinverse approach. Such developments could broaden its applications, potentially into fields such as biomedical engineering where data asymmetry is common.
Conclusion
This paper delineates an intriguing and methodologically distinct approach to neural network training, offering a viable alternative to traditional gradient-based methods. While it holds promise, its breadth of application would benefit from further refinement and testing across diverse datasets and use cases.