
A Constructive Approach to Function Realization by Neural Stochastic Differential Equations (2307.00215v2)

Published 1 Jul 2023 in math.OC and cs.LG

Abstract: The problem of function approximation by neural dynamical systems has typically been approached in a top-down manner: Any continuous function can be approximated to an arbitrary accuracy by a sufficiently complex model with a given architecture. This can lead to high-complexity controls which are impractical in applications. In this paper, we take the opposite, constructive approach: We impose various structural restrictions on system dynamics and consequently characterize the class of functions that can be realized by such a system. The systems are implemented as a cascade interconnection of a neural stochastic differential equation (Neural SDE), a deterministic dynamical system, and a readout map. Both probabilistic and geometric (Lie-theoretic) methods are used to characterize the classes of functions realized by such systems.
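The cascade structure described in the abstract — a Neural SDE feeding a deterministic dynamical system, followed by a readout map — can be sketched numerically with an Euler–Maruyama loop. The sketch below is purely illustrative: the tanh drift, constant diffusion coefficient, linear deterministic stage, and tanh readout are placeholder choices, not the paper's constructions or its structural restrictions.

```python
import numpy as np

def simulate_cascade(x0, z0, T=1.0, n_steps=1000, seed=0):
    """Illustrative sketch of the cascade interconnection:
    Neural SDE -> deterministic system -> readout map.
    All dynamics here are placeholder choices, not the paper's."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x, z = x0, z0
    for _ in range(n_steps):
        # Stage 1 (Neural SDE, Euler-Maruyama step): dX = b(X) dt + sigma dW
        drift = np.tanh(x)      # placeholder "neural" drift b(X)
        diffusion = 0.5         # placeholder constant diffusion sigma
        x = x + drift * dt + diffusion * np.sqrt(dt) * rng.standard_normal()
        # Stage 2 (deterministic system driven by X): dZ = (-Z + X) dt
        z = z + (-z + x) * dt   # placeholder stable linear dynamics
    # Stage 3: readout map applied to the terminal state
    return np.tanh(z)           # placeholder readout
```

Fixing the seed makes the simulated Brownian increments reproducible, so repeated calls with the same arguments return the same realization of the readout.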

