Fourier analysis of the physics of transfer learning for data-driven subgrid-scale models of ocean turbulence (2504.15487v1)

Published 21 Apr 2025 in cs.LG, nlin.CD, physics.ao-ph, and physics.geo-ph

Abstract: Transfer learning (TL) is a powerful tool for enhancing the performance of neural networks (NNs) in applications such as weather and climate prediction and turbulence modeling. TL enables models to generalize to out-of-distribution data with minimal training data from the new system. In this study, we employ a 9-layer convolutional NN to predict the subgrid forcing in a two-layer ocean quasi-geostrophic system and examine which metrics best describe its performance and generalizability to unseen dynamical regimes. Fourier analysis of the NN kernels reveals that they learn low-pass, Gabor, and high-pass filters, regardless of whether the training data are isotropic or anisotropic. By analyzing the activation spectra, we identify why NNs fail to generalize without TL and how TL can overcome these limitations: the learned weights and biases from one dataset underestimate the out-of-distribution sample spectra as they pass through the network, leading to an underestimation of output spectra. By re-training only one layer with data from the target system, this underestimation is corrected, enabling the NN to produce predictions that match the target spectra. These findings are broadly applicable to data-driven parameterization of dynamical systems.

Summary

Fourier Analysis of Transfer Learning in Subgrid-Scale Modeling of Ocean Turbulence

The paper "Fourier analysis of the physics of transfer learning for data-driven subgrid-scale models of ocean turbulence" examines how transfer learning (TL) improves the generalization of neural networks (NNs) used as subgrid-scale (SGS) models of ocean turbulence. It applies a nine-layer convolutional neural network (CNN) to predict the subgrid forcing in a two-layer ocean quasi-geostrophic model and analyzes how the network adapts to different dynamical regimes through TL.

Key Findings and Contributions

The primary contribution of the paper is to explain how TL improves the generalization of NNs that model SGS processes, which are difficult to resolve because of their nonlinear, multi-scale nature. The authors analyze the spectral properties of the NN kernels and find that they learn low-pass, Gabor, and high-pass filters, regardless of whether the training data are isotropic or anisotropic. They also analyze the spectra of the NN activations, which shows why NNs struggle to generalize without TL and how TL corrects this. Without TL, the weights and biases learned from one dataset underestimate the spectra of out-of-distribution samples as they pass through the network, so the output spectra are underestimated as well.
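To make the idea of an "underestimated spectrum" concrete, the following is a minimal sketch of an angle-averaged power spectrum for a square, doubly periodic 2D field. It is not the authors' code: the function name `radial_power_spectrum`, the synthetic single-mode field, and the damping factor 0.5 are all assumptions chosen for illustration, with the synthetic fields standing in for activation maps.

```python
import numpy as np

def radial_power_spectrum(field):
    """Angle-averaged power spectrum of a square, doubly periodic 2D field."""
    n = field.shape[0]
    power = np.abs(np.fft.fft2(field) / field.size) ** 2
    k = np.fft.fftfreq(n, d=1.0 / n)               # integer wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")
    shell = np.rint(np.hypot(kx, ky)).astype(int)  # nearest integer |k|
    spectrum = np.bincount(shell.ravel(), weights=power.ravel())
    return spectrum[: n // 2 + 1]                  # keep shells up to Nyquist

# Hypothetical stand-ins for an activation map: a single mode at wavenumber 4,
# and a copy whose amplitude is halved (the 0.5 is an illustrative assumption).
n = 64
x = np.arange(n)
X, _ = np.meshgrid(x, x, indexing="ij")
reference = np.cos(2 * np.pi * 4 * X / n)
damped = 0.5 * reference

spec_ref = radial_power_spectrum(reference)
spec_damped = radial_power_spectrum(damped)
peak_k = int(np.argmax(spec_ref))
ratio = spec_damped[peak_k] / spec_ref[peak_k]     # power scales with amplitude squared
```

Comparing such spectra layer by layer, for in-distribution and out-of-distribution inputs, is one way to see where in a network the amplitude deficit appears.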

The paper tests the generalization capability of base neural networks (BNNs) and highlights cases where they fail on out-of-distribution data. TL closes much of this gap with minimal additional data: re-training a single layer of the network with data from the target system yields predictions that match the target spectra. This spectral alignment matters because it lets a network trained on one dataset adapt to a new dynamical regime without extensive retraining.
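The single-layer re-training step can be sketched with a toy model. The example below is an assumption-laden illustration, not the paper's setup: it uses a small two-layer dense network in NumPy rather than a nine-layer CNN, a synthetic "target system" built by rescaling the base weights by a hypothetical factor of 1.5, and it arbitrarily picks the first layer as the one to re-train.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, d_in, d_hidden, d_out = 256, 8, 16, 4

# "Base" weights, standing in for a network trained on the source system.
W1_base = 0.3 * rng.standard_normal((d_in, d_hidden))
W2 = 0.3 * rng.standard_normal((d_hidden, d_out))

# Synthetic target-system data (assumption: the target differs only in layer 1).
X = rng.standard_normal((n_samples, d_in))
Y_target = np.tanh(X @ (1.5 * W1_base)) @ W2

def forward(X, W1, W2):
    H = np.tanh(X @ W1)
    return H, H @ W2

def mse(pred, Y):
    return float(np.mean((pred - Y) ** 2))

W1 = W1_base.copy()      # the only layer that will be re-trained
W2_frozen = W2.copy()    # kept to verify that W2 never changes
initial_loss = mse(forward(X, W1, W2)[1], Y_target)

lr = 0.2
for _ in range(500):
    H, pred = forward(X, W1, W2)
    d_pred = 2.0 * (pred - Y_target) / pred.size  # gradient of the MSE
    d_H = d_pred @ W2.T                           # backpropagate through frozen W2
    d_Z = d_H * (1.0 - H ** 2)                    # derivative of tanh
    W1 -= lr * (X.T @ d_Z)                        # update W1 only

final_loss = mse(forward(X, W1, W2)[1], Y_target)
```

In this toy setting the loss on the target data drops while `W2` stays untouched, which mirrors the idea of adapting a network by updating one layer and freezing the rest.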

Spectral Analysis and Kernel Adaptation

Using Fourier analysis, the authors examine the kernel weights and show how TL modifies them to match the physics of the target system. The kernels act as spectral filters, and the paper finds that TL realigns the distribution of kernel weights so that the network better captures the low- and high-wavenumber information needed for accurate turbulence modeling. This ability to modify the filters helps the network generalize across flow configurations, extending its applicability beyond the regime it was first trained on.
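As an illustration of reading a convolution kernel as a spectral filter, the sketch below zero-pads a small kernel, takes its 2D FFT, and labels it by where the spectrum peaks. The 5x5 kernel size, the three hand-built kernels, the function names, and the 0.25 and 0.75 thresholds are all hypothetical choices for this example, not values from the paper.

```python
import numpy as np

def dominant_wavenumber(kernel, n=32):
    """Wavenumber of the spectral peak, normalized so Nyquist equals 1."""
    mag = np.abs(np.fft.fft2(kernel, s=(n, n)))  # zero-pad the kernel to n x n
    k = np.fft.fftfreq(n, d=1.0 / n)             # integer wavenumbers
    i, j = np.unravel_index(np.argmax(mag), mag.shape)
    return max(abs(k[i]), abs(k[j])) / (n / 2)

def classify(kernel, n=32):
    """Crude filter label; the thresholds are illustrative assumptions."""
    peak = dominant_wavenumber(kernel, n)
    if peak < 0.25:
        return "low-pass"
    if peak > 0.75:
        return "high-pass"
    return "band-pass (Gabor-like)"

# Hand-built 5x5 example kernels.
offsets = np.arange(-2, 3)
I, J = np.meshgrid(offsets, offsets, indexing="ij")
gaussian = np.exp(-(I ** 2 + J ** 2) / 2.0)      # smooth bump
gabor = gaussian * np.cos(np.pi * I / 2.0)       # bump times an oscillation
checker = gaussian * (-1.0) ** (I + J)           # sign flips at every grid point

labels = [classify(kern) for kern in (gaussian, gabor, checker)]
```

Applied to trained kernels before and after TL, a diagnostic of this kind would show whether re-training shifts filters toward lower or higher wavenumbers.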

Practical Implications and Future Directions

The implications of this research are significant for data-driven modeling in the geosciences, particularly in climate and ocean dynamics. The ability to generalize reliably across varying conditions without extensive retraining or large new datasets could considerably improve the predictive power of ocean and climate models, leading to more accurate simulations that require fewer computational resources. This matters given the growing demand for high-resolution models under tight computational limits.

Theoretically, the paper contributes to a deeper understanding of the mechanics of TL in neural networks, offering a physics-informed perspective that can inform future developments in machine learning applications in Earth systems science. It raises pertinent questions about the optimal initialization and training strategies for data-driven models, suggesting pathways for incorporating physical insights into model development.

Future research directions might include extending the spectral analysis framework to larger, more complex systems, such as three-dimensional ocean models or general circulation models. Investigating the role of TL in other areas of physics-informed machine learning could also provide further insight into general strategies for improving model generalization and efficiency.

In summary, the paper provides a detailed investigation into the spectral dynamics of transfer learning within the context of ocean subgrid-scale turbulence modeling. Its insights into the interplay between physics-based understanding and machine learning highlight a promising avenue for advancing data-driven modeling methodologies in geophysical sciences.
