Size and depth of monotone neural networks: interpolation and approximation (2207.05275v2)

Published 12 Jul 2022 in cs.LG, math.OC, and stat.ML

Abstract: We study monotone neural networks with threshold gates where all the weights (other than the biases) are non-negative. We focus on the expressive power and efficiency of representation of such networks. Our first result establishes that every monotone function over $[0,1]^d$ can be approximated within arbitrarily small additive error by a depth-4 monotone network. When $d > 3$, we improve upon the previous best-known construction, which has depth $d+1$. Our proof proceeds by solving the monotone interpolation problem for monotone datasets using a depth-4 monotone threshold network. In our second main result we compare size bounds between monotone and arbitrary neural networks with threshold gates. We find that there are monotone real functions that can be computed efficiently by networks with no restriction on the gates, whereas monotone networks approximating these functions need size exponential in the dimension.
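
To make the model class concrete, here is a minimal NumPy sketch of a monotone threshold network in the sense above: every hidden unit is a Heaviside threshold gate whose incoming weights are non-negative (biases are unrestricted), and the output is a non-negative linear combination of the last hidden layer. The layer sizes, the random weights, and the helper names threshold_layer and monotone_threshold_net are illustrative assumptions; the sketch shows the network class being studied, not the paper's depth-4 construction.

```python
import numpy as np

def threshold_layer(x, W, b):
    """Heaviside threshold gates: unit i fires iff (W @ x)[i] >= b[i].
    Monotonicity in x follows from W having non-negative entries."""
    return (W @ x >= b).astype(float)

def monotone_threshold_net(x, layers, v, c):
    """Evaluate a monotone threshold network.

    layers : list of (W, b) pairs with W >= 0 entrywise (biases b unrestricted)
    v, c   : non-negative readout weights v and an unrestricted output bias c,
             so the network outputs c + v @ (last hidden layer).
    """
    h = x
    for W, b in layers:
        h = threshold_layer(h, W, b)
    return c + v @ h

# Toy usage on [0,1]^2 (sizes and weights are illustrative only).
rng = np.random.default_rng(0)
d, width = 2, 8
W1 = rng.uniform(0.0, 1.0, size=(width, d))   # non-negative hidden weights
b1 = rng.uniform(0.0, 1.0, size=width)        # biases may be arbitrary reals
v = rng.uniform(0.0, 1.0, size=width)         # non-negative readout weights

f = lambda x: monotone_threshold_net(np.asarray(x), [(W1, b1)], v, c=0.0)

# Monotonicity check: increasing every coordinate never decreases the output.
x_lo, x_hi = np.array([0.2, 0.3]), np.array([0.6, 0.9])
assert f(x_hi) >= f(x_lo)
print(f(x_lo), f(x_hi))
```

Because each gate applies a non-decreasing activation to a non-negative combination of its inputs, any composition of such layers is coordinate-wise non-decreasing; the final assertion spot-checks exactly this monotonicity property.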
