
Polynomial Regression As an Alternative to Neural Nets (1806.06850v3)

Published 13 Jun 2018 in cs.LG and stat.ML

Abstract: Despite the success of neural networks (NNs), there is still a concern among many over their "black box" nature. Why do they work? Here we present a simple analytic argument that NNs are in fact essentially polynomial regression models. This view will have various implications for NNs, e.g. providing an explanation for why convergence problems arise in NNs, and it gives rough guidance on avoiding overfitting. In addition, we use this phenomenon to predict and confirm a multicollinearity property of NNs not previously reported in the literature. Most importantly, given this loose correspondence, one may choose to routinely use polynomial models instead of NNs, thus avoiding some major problems of the latter, such as having to set many tuning parameters and dealing with convergence issues. We present a number of empirical results; in each case, the accuracy of the polynomial approach matches or exceeds that of NN approaches. A many-featured, open-source software package, polyreg, is available.

Citations (76)

Summary

  • The paper reinterprets neural networks as high-degree polynomial models, offering a theoretical framework aligned with the Stone-Weierstrass Theorem.
  • Empirical results on datasets like MNIST and NYC Taxi reveal that polynomial regression can match or exceed neural net performance with fewer hyperparameters.
  • The findings imply that employing polynomial regression may simplify model design and enhance predictive analytics by reducing computational complexity.

Analysis of "Polynomial Regression as an Alternative to Neural Nets"

This paper challenges the predominant reliance on neural networks (NNs) by advocating polynomial regression (PR) as an effective and often more practical alternative. The authors argue that, under certain conditions, neural networks are in essence polynomial regression models. This perspective offers a new way to understand neural networks and prompts reconsideration of traditional methods in predictive analytics.

Theoretical Insights: NN and Polynomial Regression Correspondence

The authors give an analytic argument that a neural network whose activation functions are polynomials, or are well approximated by polynomials, computes a polynomial in its inputs. Each layer composes these polynomials, so the effective degree grows with depth, producing ever more complex polynomial models. The argument is consistent with the Stone-Weierstrass Theorem, which guarantees that any continuous function on a compact set can be uniformly approximated by polynomials, reinforcing the view of NNs in polynomial terms.
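
As a concrete illustration of the degree-growth claim, the following sketch (not from the paper; the quadratic activation and the weights are arbitrary choices for illustration) expands a tiny two-layer network symbolically and reports the degree of the resulting polynomial.

```python
# Illustrative sketch: a tiny 2-layer network with a quadratic activation
# a(u) = u**2 expands symbolically into a single polynomial whose degree
# doubles with each hidden layer.
import sympy as sp

x1, x2 = sp.symbols("x1 x2")

def quad_activation(u):
    return u**2  # stand-in for a smooth activation's polynomial approximation

# Layer 1: two hidden units (weights chosen arbitrarily for illustration)
h1 = quad_activation(0.5 * x1 + 1.0 * x2)
h2 = quad_activation(-1.0 * x1 + 0.3 * x2)

# Layer 2: one hidden unit on top of layer 1
g1 = quad_activation(0.7 * h1 - 0.2 * h2)

# Output: a linear readout, expanded into an explicit polynomial in x1, x2
output = sp.expand(2.0 * g1 + 0.1)
print(sp.Poly(output, x1, x2).total_degree())  # prints 4: degree grows with depth
```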

The NN ↔ PR principle introduced in the paper suggests that, with proper understanding and implementation, polynomial regression could serve as a direct replacement for NNs, potentially simplifying model configuration and mitigating common NN challenges such as convergence problems and hidden-layer multicollinearity.
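
A minimal sketch of that drop-in usage is shown below, written with scikit-learn rather than the authors' R package polyreg; the toy data and the degree-2 choice are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch: polynomial regression as a drop-in regressor, where the
# polynomial degree is essentially the only tuning parameter.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                          # toy stand-in features
y = X[:, 0] * X[:, 1] - X[:, 2] + 0.1 * rng.normal(size=500)

pr_model = make_pipeline(
    StandardScaler(),
    PolynomialFeatures(degree=2, include_bias=False),  # all terms up to degree 2
    LinearRegression(),
)
pr_model.fit(X, y)
print("in-sample R^2:", round(pr_model.score(X, y), 3))
```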

Empirical Evaluations: Performance Comparison

The authors empirically compare polynomial regression against neural networks across a variety of datasets spanning both regression and classification tasks, including Million Song, NYC Taxi, MNIST, and several large-scale biomedical datasets. Across these tasks, polynomial regression consistently performs competitively with, and often better than, neural networks.

A striking observation arises from the authors' findings: polynomial regression, with fewer hyperparameters and simpler architectural demands, can achieve prediction accuracy on par with or better than neural networks. This stems largely from the observation that many NN models are overparameterized: mapped to their polynomial equivalents, they correspond to polynomials of excessively high degree and consequently overfit.
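
To make the overfitting point concrete, here is a hedged, self-contained sketch (the synthetic data and degree grid are arbitrary choices, not the paper's experiments) in which training error keeps shrinking while held-out error typically degrades once the degree outgrows the data.

```python
# Sketch: fit polynomials of increasing degree and compare train vs. test error.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(80, 2))
y = X[:, 0] ** 2 - X[:, 1] + 0.2 * rng.normal(size=80)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for degree in (1, 2, 4, 8, 12):
    fit = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X_tr, y_tr)
    print(f"degree {degree:2d}: "
          f"train MSE {mean_squared_error(y_tr, fit.predict(X_tr)):.3f}, "
          f"test MSE {mean_squared_error(y_te, fit.predict(X_te)):.3f}")
```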

Implications and Future Directions

The paper's conclusions suggest significant implications for how machine learning practitioners might approach predictive modeling. Applying polynomial regression, especially in scenarios where datasets are well-structured and feature interactions are sufficiently explored, can lead to reduced computational overhead and simpler model setups.

Furthermore, the multicollinearity the paper predicts and detects among the outputs of neural network layers, a consequence of their polynomial nature, paves the way for refined diagnostics and strategies such as regularization to improve NN training stability.
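
One illustrative way to probe this (an assumption of this sketch, not the paper's exact procedure) is to examine the condition number of a wide, randomly initialized hidden layer's activation matrix; strong near-linear dependence among hidden units shows up as an enormous condition number.

```python
# Hypothetical diagnostic sketch: condition number of hidden-layer activations.
import numpy as np

rng = np.random.default_rng(2)
n, p, width = 200, 5, 100                  # far more hidden units than inputs
X = rng.normal(size=(n, p))
W = 0.1 * rng.normal(size=(p, width))      # small random weights, untrained
b = 0.1 * rng.normal(size=width)

H = np.tanh(X @ W + b)                     # hidden-layer activations
print(f"condition number of activations: {np.linalg.cond(H):.2e}")
```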

Practical Recommendations and Research Outlook

Given the equivalence between NNs and PR under the discussed framework, a systematic exploration of polynomial regression as a more intuitive starting point for model training is recommended. The paper encourages continued expansion of PR capabilities to encompass more complex tasks traditionally reserved for neural networks.

Moreover, as the field advances, integrating polynomial approaches into more specialized ML applications, such as image recognition with convolutional preprocessing, represents a promising direction for future research. The central ongoing challenge is adapting polynomial regression to handle high-dimensional data efficiently, which may require advances in memory management and computational strategy.
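
The scale of that challenge follows from simple counting: the number of monomials of total degree at most d in n features is C(n + d, d), as the short calculation below shows (the feature counts are illustrative; 784 corresponds to a flattened 28x28 MNIST image).

```python
# Why naive PR struggles in high dimensions: term counts grow explosively.
from math import comb

for n in (10, 50, 784):
    for d in (2, 3, 4):
        print(f"n={n:4d}, degree<={d}: {comb(n + d, d):,} terms")
```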

In conclusion, the paper makes a compelling case for reconsidering polynomial regression within the AI community, advocating it as an underappreciated yet powerful tool with significant theoretical and practical value across many domains of machine learning and beyond.
