
Automatically Learning Hybrid Digital Twins of Dynamical Systems (2410.23691v1)

Published 31 Oct 2024 in cs.LG

Abstract: Digital Twins (DTs) are computational models that simulate the states and temporal dynamics of real-world systems, playing a crucial role in prediction, understanding, and decision-making across diverse domains. However, existing approaches to DTs often struggle to generalize to unseen conditions in data-scarce settings, a crucial requirement for such models. To address these limitations, our work begins by establishing the essential desiderata for effective DTs. Hybrid Digital Twins (HDTwins) represent a promising approach to address these requirements, modeling systems using a composition of both mechanistic and neural components. This hybrid architecture simultaneously leverages (partial) domain knowledge and neural network expressiveness to enhance generalization, with its modular design facilitating improved evolvability. While existing hybrid models rely on expert-specified architectures with only parameters optimized on data, automatically specifying and optimizing HDTwins remains intractable due to the complex search space and the need for flexible integration of domain priors. To overcome this complexity, we propose an evolutionary algorithm (HDTwinGen) that employs LLMs to autonomously propose, evaluate, and optimize HDTwins. Specifically, LLMs iteratively generate novel model specifications, while offline tools are employed to optimize emitted parameters. Correspondingly, proposed models are evaluated and evolved based on targeted feedback, enabling the discovery of increasingly effective hybrid models. Our empirical results reveal that HDTwinGen produces generalizable, sample-efficient, and evolvable models, significantly advancing DTs' efficacy in real-world applications.


Summary

  • The paper introduces a novel hybrid modeling framework that merges mechanistic models and neural networks for enhanced generalization in Digital Twins.
  • It presents HDTwinGen, an evolutionary algorithm that uses LLMs to automate the design and optimization of hybrid digital twin models.
  • Empirical tests demonstrate improved out-of-distribution generalization, sample efficiency, and modular evolvability for dynamical system modeling.

Overview of "Automatically Learning Hybrid Digital Twins of Dynamical Systems"

The paper "Automatically Learning Hybrid Digital Twins of Dynamical Systems" introduces a novel approach to enhance the modeling of dynamical systems through the creation of Hybrid Digital Twins (HDTwins). Digital Twins (DTs), which simulate real-world systems, are often limited by their capacity to generalize to unseen conditions, especially in data-scarce environments. This research addresses these challenges by proposing a hybrid architecture that combines mechanistic models with neural network components.

The authors present HDTwins as a potent alternative to traditional DT models, which typically rely on either purely mechanistic or purely neural network-based approaches. Their work is motivated by the need for DTs that offer robust out-of-distribution generalization, sample-efficient learning, and the ability to evolve with minimal retraining. To this end, the authors propose HDTwinGen, an evolutionary algorithm that employs LLMs to automate the design and optimization of HDTwins.
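Schematically, the generate-evaluate-evolve loop described in the paper might look like the Python sketch below. This is a hedged reconstruction from the abstract: propose_spec, fit_parameters, evaluate, and summarize_errors are hypothetical stand-ins, not the paper's actual API.

```python
def hdtwin_gen(llm, train_data, val_data, generations=20, top_k=5):
    """Evolutionary search over hybrid model specifications (schematic)."""
    population = []                      # (spec, model, val_loss) triples
    feedback = "no models evaluated yet"
    for _ in range(generations):
        # 1. The LLM proposes a new hybrid model specification,
        #    conditioned on the current best models and targeted feedback.
        spec = propose_spec(llm, population, feedback)
        # 2. Offline tools fit the specification's free parameters to data.
        model = fit_parameters(spec, train_data)
        # 3. Evaluate the fitted model on held-out trajectories.
        val_loss = evaluate(model, val_data)
        # 4. Retain the top-k models and derive feedback for the next round.
        population = sorted(population + [(spec, model, val_loss)],
                            key=lambda entry: entry[2])[:top_k]
        feedback = summarize_errors(model, val_data)
    return population[0]
```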

Key Contributions

  1. Hybrid Modeling Framework: The paper introduces a hybrid architecture for DTs that combines mechanistic models with neural components. This hybrid approach utilizes domain-grounded priors through mechanistic models and the expressiveness of neural networks to achieve enhanced generalization and regularization, particularly valuable in data-scarce conditions.
  2. HDTwinGen for Automated Design: The proposed HDTwinGen framework uses LLMs to automatically specify and optimize hybrid models. In an evolutionary loop, LLMs propose model structures and offline optimization tools refine their parameters on empirical data (a parameter-fitting sketch follows this list). This marks a shift from reliance on expert-specified models to automated hybrid model design, paving the way for efficient and scalable DT development.
  3. Empirical Validation: Through empirical tests, HDTwinGen was shown to produce DT models with better out-of-distribution generalization, improved sample efficiency, and greater flexibility for modular evolvability. These results underscore the potential of automated hybrid modeling to advance the state-of-the-art in DT applications.
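For the offline parameter-optimization step in contribution 2, a minimal sketch could look as follows, assuming one-step transition data and the HybridTwin model from the earlier sketch; the training setup is an assumption made for brevity, not the paper's protocol.

```python
import torch

def fit_one_step(model, states, next_states, dt=0.1, epochs=500, lr=1e-2):
    """Jointly fit mechanistic and neural parameters with Adam,
    minimizing one-step Euler prediction error (illustrative only)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        pred = states + dt * model(states)        # one Euler step forward
        loss = torch.mean((pred - next_states) ** 2)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return loss.item()
```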

Implications and Future Directions

The implications of this research are significant for AI and the modeling of dynamical systems. By automating the creation of DTs, especially in settings where data is limited, HDTwinGen offers a pathway towards more adaptive and reliable models. This approach may significantly reduce the time and expertise required to develop effective models, fostering broader applicability across domains such as healthcare, engineering, and environmental science.

Looking forward, the deployment of HDTwinGen in real-world scenarios could spur further enhancements in DT frameworks. Future research may explore the integration of additional domain-specific constraints or the incorporation of more sophisticated neural components. Moreover, as LLM capabilities advance, their role in hybrid model optimization is likely to grow, enabling even more intelligent and context-aware model designs.

In conclusion, this paper provides a compelling advancement in the domain of Digital Twins through innovative hybrid modeling, automated design, and optimization using LLMs. It sets a foundation for future research to further refine and adopt these technologies across a spectrum of complex, data-scarce environments.
