Expressive Power of ReLU and Step Networks under Floating-Point Operations (2401.15121v2)

Published 26 Jan 2024 in cs.LG and cs.AI

Abstract: The study of the expressive power of neural networks has investigated their fundamental limits. Most existing results assume real-valued inputs and parameters as well as exact operations during the evaluation of neural networks. However, neural networks are typically executed on computers that can represent only a tiny subset of the reals and apply inexact operations; hence, most existing results do not apply to the networks used in practice. In this work, we analyze the expressive power of neural networks under a more realistic setup: floating-point numbers and operations, as used in practice. Our first set of results assumes floating-point arithmetic in which the significand of a float is represented by finitely many bits but the exponent can take any integer value. Under this setup, we show that neural networks using a binary threshold unit or ReLU can memorize any finite set of input/output pairs and can approximate any continuous function to within arbitrary error. In particular, the number of parameters in our constructions for universal approximation and memorization coincides with that in classical results assuming exact mathematical operations. We also show similar memorization and universal approximation results when floating-point operations use finitely many bits for both the significand and the exponent; these results apply to many popular floating-point formats, such as those defined in the IEEE 754 standard (e.g., the 32-bit single-precision format) and bfloat16.
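
The central premise, that exact-arithmetic expressivity results need not carry over once a network is evaluated in floating point, can be seen in a small experiment. The sketch below is not taken from the paper; it only uses NumPy's IEEE 754 binary32 type (the paper also covers formats such as bfloat16, which plain NumPy does not provide) and shows a tiny ReLU expression whose exact value survives double-precision evaluation but collapses under single-precision rounding.

# Minimal sketch (not from the paper): why exact-arithmetic expressivity
# results need not transfer to floating-point evaluation.
import numpy as np

def relu(x):
    return np.maximum(x, 0)

# Tiny ReLU expression y = relu(x + b) - relu(x), which equals b = 1
# for any x >= 0 under exact arithmetic.
x, b = 1e8, 1.0

y_exact = relu(x + b) - relu(x)  # binary64 evaluation: 1.0
y_f32 = relu(np.float32(x) + np.float32(b)) - relu(np.float32(x))
# In float32 the spacing (ulp) at 1e8 is 8, so x + b rounds back to x
# and the single-precision result collapses to 0.0.
print(y_exact, y_f32)  # 1.0 0.0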
