
Combining data assimilation and machine learning to emulate a dynamical model from sparse and noisy observations: a case study with the Lorenz 96 model (2001.01520v2)

Published 6 Jan 2020 in stat.ML, cs.LG, and physics.ao-ph

Abstract: A novel method, based on the combination of data assimilation and machine learning is introduced. The new hybrid approach is designed for a two-fold scope: (i) emulating hidden, possibly chaotic, dynamics and (ii) predicting their future states. The method consists in applying iteratively a data assimilation step, here an ensemble Kalman filter, and a neural network. Data assimilation is used to optimally combine a surrogate model with sparse noisy data. The output analysis is spatially complete and is used as a training set by the neural network to update the surrogate model. The two steps are then repeated iteratively. Numerical experiments have been carried out using the chaotic 40-variables Lorenz 96 model, proving both convergence and statistical skill of the proposed hybrid approach. The surrogate model shows short-term forecast skill up to two Lyapunov times, the retrieval of positive Lyapunov exponents as well as the more energetic frequencies of the power density spectrum. The sensitivity of the method to critical setup parameters is also presented: the forecast skill decreases smoothly with increased observational noise but drops abruptly if less than half of the model domain is observed. The successful synergy between data assimilation and machine learning, proven here with a low-dimensional system, encourages further investigation of such hybrids with more sophisticated dynamics.

Authors (4)
  1. Julien Brajard (13 papers)
  2. Alberto Carassi (1 paper)
  3. Marc Bocquet (46 papers)
  4. Laurent Bertino (12 papers)
Citations (205)

Summary

Combining Data Assimilation and Machine Learning for Emulating Dynamical Models

This summary covers the paper "Combining data assimilation and machine learning to emulate a dynamical model from sparse and noisy observations: a case study with the Lorenz 96 model" by J. Brajard et al., which presents a hybrid method combining data assimilation (DA) and machine learning (ML) to emulate a dynamical system from partial and noisy observations. The paper brings together DA, commonly used to combine observations with numerical models, and ML, in particular neural networks, which have shown potential for modeling complex systems.

The stated objective of the hybrid method is two-fold: to emulate hidden, potentially chaotic dynamics, and to predict their future states. In the DA step, the authors use an ensemble Kalman filter to combine a surrogate model with sparse and noisy data. The resulting analysis is spatially complete and serves as the training set for the neural network, which in turn updates the surrogate model. The two steps are then repeated, so that each pass refines the surrogate model used in the next DA step.
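To make the DA step described above more concrete, here is a minimal sketch of one ensemble Kalman filter analysis. It is not the authors' implementation: this summary does not say which EnKF variant the paper uses, so the sketch assumes a perturbed-observation filter that assimilates one scalar observation at a time, with uncorrelated observation errors and no localization or inflation. The function name `enkf_serial_analysis` and its arguments are hypothetical.

```python
import random


def enkf_serial_analysis(ensemble, obs_indices, obs_values, obs_std, rng):
    """Perturbed-observation EnKF analysis, one scalar observation at a time.

    ensemble: list of members, each a list of floats (the forecast states).
    obs_indices / obs_values: which state variables are observed, and the data.
    obs_std: standard deviation of the (assumed uncorrelated) observation noise.
    Returns a new ensemble; the input is left unchanged.
    """
    members = [list(m) for m in ensemble]
    n_members = len(members)
    n_state = len(members[0])
    obs_var = obs_std ** 2

    for j, y in zip(obs_indices, obs_values):
        # Ensemble values of the observed variable and their anomalies.
        hx = [m[j] for m in members]
        hx_mean = sum(hx) / n_members
        hx_anom = [h - hx_mean for h in hx]
        hx_var = sum(a * a for a in hx_anom) / (n_members - 1)

        # Kalman gain for each state variable k: cov(x_k, hx) / (var(hx) + r).
        gain = []
        for k in range(n_state):
            xk_mean = sum(m[k] for m in members) / n_members
            cov = sum((m[k] - xk_mean) * a for m, a in zip(members, hx_anom))
            gain.append(cov / (n_members - 1) / (hx_var + obs_var))

        # Each member is nudged towards its own perturbed copy of the observation.
        for m, h in zip(members, hx):
            innovation = y + rng.gauss(0.0, obs_std) - h
            for k in range(n_state):
                m[k] += gain[k] * innovation
    return members


# Toy usage: 3 members, 2 state variables, only variable 0 is observed.
forecast = [[0.0, 1.0], [2.0, 2.0], [4.0, 6.0]]
analysis = enkf_serial_analysis(forecast, [0], [1.0], obs_std=0.5, rng=random.Random(0))
```

Because the gain is built from ensemble covariances, the unobserved variable is also updated, which is how a filter of this kind can turn sparse observations into a spatially complete analysis.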

The method is tested on the chaotic 40-variable Lorenz 96 model, a low-dimensional system. The numerical experiments show both convergence of the iterative procedure and statistical skill of the resulting surrogate. The surrogate model has short-term forecast skill up to two Lyapunov times, and it recovers the positive Lyapunov exponents as well as the more energetic frequencies of the power density spectrum. The surrogate therefore does more than predict the near future: it also reproduces longer-term statistical properties of the system.
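For readers unfamiliar with the test bed, the following sketch integrates the Lorenz 96 model with 40 variables. The forcing `forcing=8.0`, the time step `dt=0.05`, and the Runge-Kutta integrator are common choices for this model and are assumptions here; this summary does not state the values used in the paper.

```python
def l96_tendency(x, forcing=8.0):
    """Right-hand side of Lorenz 96: dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F."""
    n = len(x)
    # Negative indices wrap around in Python, which gives the periodic boundary.
    return [(x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing for i in range(n)]


def rk4_step(x, dt=0.05, forcing=8.0):
    """Advance the state by one step with the classical 4th-order Runge-Kutta scheme."""
    def shifted(k, scale):
        return [xi + scale * ki for xi, ki in zip(x, k)]

    k1 = l96_tendency(x, forcing)
    k2 = l96_tendency(shifted(k1, 0.5 * dt), forcing)
    k3 = l96_tendency(shifted(k2, 0.5 * dt), forcing)
    k4 = l96_tendency(shifted(k3, dt), forcing)
    return [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def integrate(x0, n_steps, dt=0.05, forcing=8.0):
    """Return the trajectory [x_0, x_1, ..., x_{n_steps}] as a list of states."""
    trajectory = [list(x0)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(trajectory[-1], dt, forcing))
    return trajectory


# 40 variables at the fixed point x_i = F, with one variable slightly nudged.
x0 = [8.0] * 40
x0[19] += 0.01
trajectory = integrate(x0, n_steps=200)
```

The uniform state `x_i = F` is a fixed point of the equations, so the small nudge on one variable is what sets the trajectory in motion. In the hybrid method, a model like this plays the role of the hidden "true" dynamics that the neural-network surrogate is trained to imitate.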

The paper also examines the sensitivity of the method to critical setup parameters. Forecast skill degrades smoothly as observational noise increases, but it drops abruptly when less than half of the model domain is observed. The method thus tolerates noisy data more gracefully than very sparse data, which makes observation density a central consideration in practical implementations.
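The two setup parameters discussed above, observation density and observation noise, can be illustrated with a small observation generator. This is a sketch under assumptions: the summary does not describe how observed locations are chosen in the paper, so the code simply draws a random subset of the state and adds Gaussian noise. The function name `observe` and its parameters are hypothetical.

```python
import random


def observe(state, fraction, noise_std, rng):
    """Observe a random subset of the state with additive Gaussian noise.

    Returns (indices, values): the sorted positions observed and the noisy data.
    """
    n_obs = round(fraction * len(state))
    indices = sorted(rng.sample(range(len(state)), n_obs))
    values = [state[i] + rng.gauss(0.0, noise_std) for i in indices]
    return indices, values


rng = random.Random(0)
truth = [float(i) for i in range(40)]
# Half of the 40 variables observed, with unit-variance noise.
indices, values = observe(truth, fraction=0.5, noise_std=1.0, rng=rng)
```

Sweeping `fraction` and `noise_std` in a generator like this is one way to set up the kind of sensitivity experiment the paper reports.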

In practical and theoretical terms, this research is a step towards surrogate models that remain skillful when built from imperfect data. Coupling DA with ML extends the data assimilation framework to cases where the dynamical model itself is unknown and must be learned. The methodology is not specific to the Lorenz 96 model, which suggests possible applications in atmospheric, oceanographic, and earth sciences, where sparse and noisy data are common.

Future developments could investigate the extension of this methodology to higher-dimensional and more sophisticated systems. Additionally, enhancing computational efficiency and scalability could facilitate the application of this approach in operational settings. The possibility of integrating stochastic elements and optimizing hyperparameters might lead to even more refined surrogate models.

In conclusion, the paper shows that data assimilation and machine learning can be combined to learn a surrogate of a chaotic system from sparse and noisy observations. The results were obtained with a low-dimensional system, and they encourage further investigation of such hybrids with more sophisticated dynamics.