Log Neural Controlled Differential Equations: The Lie Brackets Make a Difference (2402.18512v3)
Abstract: The vector field of a controlled differential equation (CDE) describes the relationship between a control path and the evolution of a solution path. Neural CDEs (NCDEs) treat time series data as observations from a control path, parameterise a CDE's vector field using a neural network, and use the solution path as a continuously evolving hidden state. As their formulation makes them robust to irregular sampling rates, NCDEs are a powerful approach for modelling real-world data. Building on neural rough differential equations (NRDEs), we introduce Log-NCDEs, a novel, effective, and efficient method for training NCDEs. The core component of Log-NCDEs is the Log-ODE method, a tool from the study of rough paths for approximating a CDE's solution. Log-NCDEs are shown to outperform NCDEs, NRDEs, the linear recurrent unit, S5, and MAMBA on a range of multivariate time series datasets with up to $50{,}000$ observations.
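To make the abstract's description concrete, here is a minimal sketch of how a CDE $\mathrm{d}h_t = f(h_t)\,\mathrm{d}X_t$ turns irregularly sampled observations into a continuously evolving hidden state. It uses a plain Euler discretisation, not the Log-ODE method the paper builds on, which additionally uses higher-order terms involving Lie brackets of the vector field. The names `euler_cde` and `toy_field` are hypothetical, and `toy_field` is a fixed tanh map standing in for a trained neural network.

```python
import math


def euler_cde(vector_field, h0, times, values):
    """Euler scheme for dh = f(h) dX along a piecewise-linear control path.

    vector_field(h) returns a u-by-v matrix as a list of rows, where u is the
    hidden-state size and v is the number of control channels (time + data).
    """
    h = list(h0)
    # Prepend time as a channel so irregular sampling gaps enter the control.
    path = [[t] + list(x) for t, x in zip(times, values)]
    for prev, curr in zip(path, path[1:]):
        dX = [c - p for p, c in zip(prev, curr)]  # control increment
        F = vector_field(h)                       # u-by-v matrix at current state
        h = [h_i + sum(F_ij * dx_j for F_ij, dx_j in zip(row, dX))
             for h_i, row in zip(h, F)]
    return h


def toy_field(h):
    # Hypothetical stand-in for a neural network vector field (u = 2, v = 2).
    return [[math.tanh(h[0] - h[1]), math.tanh(0.5 * h[1])],
            [math.tanh(h[0] + h[1]), math.tanh(-h[0])]]


# One data channel observed at irregular times.
times = [0.0, 0.1, 0.5, 0.55, 1.3]
values = [[0.0], [0.2], [-0.1], [0.4], [0.3]]
h_T = euler_cde(toy_field, [0.0, 0.0], times, values)
```

Because the update depends only on increments of the control path, unevenly spaced observations need no resampling: the gap between samples appears as the increment of the time channel.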
Authors: Benjamin Walker, Andrew D. McLeod, Tiexin Qin, Yichuan Cheng, Haoliang Li, Terry Lyons