Size and depth of monotone neural networks: interpolation and approximation (2207.05275v2)
Abstract: We study monotone neural networks with threshold gates where all the weights (other than the biases) are non-negative. We focus on the expressive power and efficiency of representation of such networks. Our first result establishes that every monotone function over $[0,1]^d$ can be approximated within arbitrarily small additive error by a depth-4 monotone network. When $d > 3$, this improves upon the previous best-known construction, which has depth $d+1$. Our proof proceeds by solving the monotone interpolation problem for monotone datasets using a depth-4 monotone threshold network. In our second main result we compare size bounds between monotone and arbitrary neural networks with threshold gates. We find that there are monotone real functions that can be computed efficiently by networks with no restriction on the gates, whereas monotone networks approximating these functions require size exponential in the dimension.
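The model described in the abstract is concrete enough to sketch in code. The following is a minimal illustrative sketch, not a construction from the paper: every gate computes $\mathbf{1}[w \cdot x + b \ge 0]$ with entrywise non-negative $w$ and an unconstrained bias $b$, and the real-valued output is a non-negative combination of top-layer gates. The function names and the staircase example are assumptions chosen purely for illustration.

```python
import numpy as np

def threshold_layer(x, W, b):
    """One layer of threshold gates 1[x @ W + b >= 0].

    Monotonicity of the network requires W >= 0 entrywise;
    the biases b may be arbitrary (negative included).
    """
    assert np.all(W >= 0), "monotone networks use non-negative weights"
    return (x @ W + b >= 0).astype(float)

def monotone_network(x, hidden_layers, w_out, b_out=0.0):
    """Compose threshold layers, then take a non-negative output combination."""
    h = x
    for W, b in hidden_layers:
        h = threshold_layer(h, W, b)
    assert np.all(w_out >= 0), "output weights must also be non-negative"
    return h @ w_out + b_out

# Toy example: a depth-2 monotone staircase approximating f(x) = x on [0, 1]
# within additive error 1/4, namely sum_{k=1}^{4} (1/4) * 1[x >= k/4].
W1 = np.ones((1, 4))                    # each gate reads x with weight 1
b1 = -np.array([0.25, 0.5, 0.75, 1.0])  # gate k fires iff x >= k/4
w_out = np.full(4, 0.25)                # non-negative output weights

x = np.array([[0.0], [0.3], [0.6], [1.0]])
print(monotone_network(x, [(W1, b1)], w_out))  # -> [0.   0.25 0.5  1.  ]
```

Since each gate is non-decreasing in its inputs, and compositions and non-negative combinations of non-decreasing functions are again non-decreasing, any network of this form computes a monotone function; refining the staircase with more gates is the one-dimensional analogue of the approximation question the paper studies over $[0,1]^d$.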