Fast and Exact Enumeration of Deep Networks Partitions Regions (2401.11188v1)

Published 20 Jan 2024 in cs.LG and cs.AI

Abstract: One fruitful formulation of Deep Networks (DNs), enabling their theoretical study and providing practical guidelines to practitioners, relies on piecewise affine splines. In that framework, a DN's input-output mapping is expressed as a per-region affine mapping, where the regions are implicitly determined by the model's architecture and form a partition of its input space. That partition, which is involved in all the results stemming from this line of research, has so far only been computed on 2- or 3-dimensional slices of the DN's input space or estimated by random sampling. In this paper, we provide the first parallel algorithm that performs exact enumeration of the DN's partition regions. The proposed algorithm makes it possible to finally assess the accuracy of commonly employed approximation methods, e.g., those based on random sampling of the DN's input space. One of our key findings is that if one is only interested in regions with "large" volume, then uniform sampling of the space is highly efficient; but if one is also interested in discovering the "small" regions of the partition, then uniform sampling becomes exponentially costly in the dimension of the DN's input space. In contrast, our proposed method has complexity scaling linearly with the input dimension and the number of regions.
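
The sampling-versus-enumeration trade-off described in the abstract can be illustrated with a minimal sketch (this is not the paper's exact-enumeration algorithm): uniformly sample the input space of a small ReLU network and count distinct activation patterns, each of which corresponds to one affine region of the partition. The network sizes, the sampling box, and the helper names (`activation_pattern`, `estimate_regions`) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer ReLU network with random weights (hypothetical sizes).
d_in, h1, h2 = 2, 8, 8
W1, b1 = rng.standard_normal((h1, d_in)), rng.standard_normal(h1)
W2, b2 = rng.standard_normal((h2, h1)), rng.standard_normal(h2)

def activation_pattern(x):
    """Return the concatenated ReLU on/off pattern for input x.

    Inputs sharing the same pattern lie in the same region, on which the
    network is a single affine mapping.
    """
    z1 = W1 @ x + b1
    a1 = np.maximum(z1, 0.0)
    z2 = W2 @ a1 + b2
    return tuple((z1 > 0).tolist() + (z2 > 0).tolist())

def estimate_regions(n_samples, box=3.0):
    """Count distinct activation patterns hit by uniform samples in [-box, box]^d_in."""
    patterns = set()
    for _ in range(n_samples):
        x = rng.uniform(-box, box, size=d_in)
        patterns.add(activation_pattern(x))
    return len(patterns)

# Small regions are hit only rarely, so the estimate typically keeps growing
# with the sample size, illustrating how uniform sampling under-counts the
# partition; the effect worsens as the input dimension increases.
for n in (10**2, 10**3, 10**4, 10**5):
    print(n, estimate_regions(n))
```

Such a sampling-based count only lower-bounds the true number of regions, which is what motivates the exact, parallel enumeration proposed in the paper.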
