Streamlining Ocean Dynamics Modeling with Fourier Neural Operators: A Multiobjective Hyperparameter and Architecture Optimization Approach

Published 7 Apr 2024 in cs.LG, physics.ao-ph, and stat.ML (arXiv:2404.05768v2)

Abstract: Training an effective deep learning model to learn ocean processes involves careful choices of various hyperparameters. We leverage the advanced search algorithms for multiobjective optimization in DeepHyper, a scalable hyperparameter optimization software, to streamline the development of neural networks tailored for ocean modeling. The focus is on optimizing Fourier neural operators (FNOs), a data-driven model capable of simulating complex ocean behaviors. Selecting the correct model and tuning the hyperparameters are challenging tasks, requiring much effort to ensure model accuracy. DeepHyper allows efficient exploration of hyperparameters associated with data preprocessing, FNO architecture, and various model training strategies. We aim to obtain an optimal set of hyperparameters leading to the most performant model. Moreover, on top of the commonly used mean squared error for model training, we propose adopting the negative anomaly correlation coefficient as an additional loss term to improve model performance and investigate the potential trade-off between the two terms. The experimental results show that the optimal set of hyperparameters enhanced model performance in single-timestep forecasting and greatly exceeded the baseline configuration in the autoregressive rollout for long-horizon forecasting up to 30 days. Utilizing DeepHyper, we demonstrate an approach to enhance the use of FNOs in ocean dynamics forecasting, offering a scalable solution with improved precision.
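The combined training objective described in the abstract pairs the mean squared error with the negative anomaly correlation coefficient (ACC), where anomalies are deviations from a reference climatology. A minimal NumPy sketch of such a loss is shown below; the weighting parameter `alpha` and the function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def anomaly_correlation(pred, truth, climatology):
    """Anomaly correlation coefficient (ACC) between a forecast and the truth.

    Anomalies are deviations from a reference climatology; the ACC is the
    (uncentered) correlation of those anomaly fields. A perfect forecast
    gives ACC = 1.
    """
    pa = pred - climatology
    ta = truth - climatology
    num = np.sum(pa * ta)
    den = np.sqrt(np.sum(pa ** 2) * np.sum(ta ** 2))
    return num / den

def combined_loss(pred, truth, climatology, alpha=0.5):
    """Weighted sum of MSE and the negative ACC.

    `alpha` balances the two terms; it is an assumed hyperparameter for
    illustration, one the paper's multiobjective search could expose.
    """
    mse = np.mean((pred - truth) ** 2)
    acc = anomaly_correlation(pred, truth, climatology)
    return alpha * mse + (1.0 - alpha) * (-acc)
```

Minimizing the negative ACC rewards forecasts whose anomaly pattern matches the truth even when their amplitudes differ, which complements the pointwise penalty of the MSE term.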
