Equidistribution-based training of Free Knot Splines and ReLU Neural Networks (2407.02153v2)

Published 2 Jul 2024 in cs.LG, cs.NA, and math.NA

Abstract: We consider the problem of univariate nonlinear function approximation using shallow neural networks (NN) with a rectified linear unit (ReLU) activation function. We show that the $L_2$-based approximation problem is ill-conditioned and that the behaviour of optimisation algorithms used in training these networks degrades rapidly as the width of the network increases. This can lead to significantly poorer approximation in practice than expected from the theoretical expressivity of the ReLU architecture and traditional methods such as univariate Free Knot Splines (FKS). Univariate shallow ReLU NNs and FKS span the same function space, and thus have the same theoretical expressivity. However, the FKS representation remains well-conditioned as the number of knots increases. We leverage the theory of optimal piecewise linear interpolants to improve the training procedure for ReLU NNs. Using the equidistribution principle, we propose a two-level procedure for training the FKS by first solving the nonlinear problem of finding the optimal knot locations of the interpolating FKS, and then determining the optimal weights and knots of the FKS by solving a nearly linear, well-conditioned problem. The training of the FKS gives insights into how we can train a ReLU NN effectively, with an equally accurate approximation. We combine the training of the ReLU NN with an equidistribution-based loss to find the breakpoints of the ReLU functions. This is then combined with preconditioning the ReLU NN approximation to find the scalings of the ReLU functions. This fast, well-conditioned and reliable method finds an accurate shallow ReLU NN approximation to a univariate target function. We test this method on a series of regular, singular, and rapidly varying target functions and obtain good results, realising the expressivity of the shallow ReLU network in all cases. We then extend our results to deeper networks.
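The two-level idea described in the abstract (choose knot locations by equidistribution, then solve a nearly linear problem for the nodal values) can be illustrated with a short numerical sketch. The snippet below is not the authors' implementation: the monitor exponent `alpha`, the finite-difference curvature estimate, the helper names `equidistribute_knots` and `fit_fks_values`, and the tanh test function are all illustrative assumptions; the paper derives the precise monitor from the $L_2$ interpolation error.

```python
import numpy as np

def equidistribute_knots(f, a=0.0, b=1.0, n_knots=17, n_fine=2001, alpha=0.5):
    """Place knots so that a curvature-based monitor density is
    equidistributed over the intervals (de Boor-style remeshing).

    alpha sets the monitor |f''|**alpha; 0.5 is a representative choice,
    not necessarily the exponent derived in the paper.
    """
    x = np.linspace(a, b, n_fine)
    u = f(x)
    # crude second-derivative estimate on a fine grid
    d2 = np.gradient(np.gradient(u, x), x)
    rho = np.abs(d2) ** alpha + 1e-8          # regularised monitor density
    # cumulative monitor integral (trapezoid rule), normalised to [0, 1]
    c = np.concatenate([[0.0],
                        np.cumsum(0.5 * (rho[1:] + rho[:-1]) * np.diff(x))])
    c /= c[-1]
    # invert the cumulative distribution at equally spaced levels
    levels = np.linspace(0.0, 1.0, n_knots)
    return np.interp(levels, c, x)

def fit_fks_values(f, knots, n_quad=2001):
    """With the knots fixed, the nodal values of the piecewise linear FKS
    solve a well-conditioned linear least-squares problem in the
    hat-function basis, sampled on a fine quadrature grid."""
    xq = np.linspace(knots[0], knots[-1], n_quad)
    B = np.zeros((n_quad, len(knots)))
    for j in range(len(knots)):
        e = np.zeros(len(knots))
        e[j] = 1.0
        B[:, j] = np.interp(xq, knots, e)     # j-th hat function on xq
    coeffs, *_ = np.linalg.lstsq(B, f(xq), rcond=None)
    return coeffs

if __name__ == "__main__":
    f = lambda x: np.tanh(25.0 * (x - 0.5))   # rapidly varying target
    knots = equidistribute_knots(f, n_knots=33)
    vals = fit_fks_values(f, knots)
    xq = np.linspace(0.0, 1.0, 4001)
    err = np.max(np.abs(np.interp(xq, knots, vals) - f(xq)))
    print(f"max error with 33 equidistributed knots: {err:.2e}")
```

Once the knots are fixed, the nodal values enter linearly through the hat-function basis, which is why this second stage is well conditioned, in contrast to optimising the weights and biases of a shallow ReLU network directly.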
