Hierarchical Neural Operator Transformer with Learnable Frequency-aware Loss Prior for Arbitrary-scale Super-resolution (2405.12202v1)
Abstract: In this work, we present an arbitrary-scale super-resolution (SR) method to enhance the resolution of scientific data, which often involves complex challenges such as continuity, multi-scale physics, and the intricacies of high-frequency signals. Grounded in operator learning, the proposed method is resolution-invariant. The core of our model is a hierarchical neural operator that leverages a Galerkin-type self-attention mechanism, enabling efficient learning of mappings between function spaces. Sinc filters are used to facilitate the information transfer across different levels in the hierarchy, thereby ensuring representation equivalence in the proposed neural operator. Additionally, we introduce a learnable prior structure that is derived from the spectral resizing of the input data. This loss prior is model-agnostic and is designed to dynamically adjust the weighting of pixel contributions, thereby balancing gradients effectively across the model. We conduct extensive experiments on diverse datasets from different domains and demonstrate consistent improvements compared to strong baselines, which consist of various state-of-the-art SR methods.
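To make the abstract's "Galerkin-type self-attention" concrete, below is a minimal sketch of a softmax-free, linear-complexity attention layer in the spirit of Cao (2021), which the paper builds on. The class name, head layout, and normalization placement are illustrative assumptions, not the authors' exact architecture: keys and values are layer-normalized and contracted first, so the cost scales linearly in the number of sampled points rather than quadratically.

```python
# Minimal sketch of Galerkin-type (softmax-free) self-attention.
# Assumption: names, shapes, and normalization placement are illustrative only.
import torch
import torch.nn as nn


class GalerkinSelfAttention(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        assert dim % heads == 0
        self.heads = heads
        self.head_dim = dim // heads
        self.to_qkv = nn.Linear(dim, 3 * dim, bias=False)
        # Galerkin-type attention normalizes keys and values instead of applying softmax.
        self.norm_k = nn.LayerNorm(self.head_dim)
        self.norm_v = nn.LayerNorm(self.head_dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_points, dim) -- a function discretized at n_points locations.
        b, n, d = x.shape
        qkv = self.to_qkv(x).reshape(b, n, 3, self.heads, self.head_dim)
        q, k, v = qkv.unbind(dim=2)                        # each: (b, n, heads, head_dim)
        q, k, v = (t.transpose(1, 2) for t in (q, k, v))   # (b, heads, n, head_dim)
        k, v = self.norm_k(k), self.norm_v(v)
        # Contract (K^T V) first: O(n * d^2) instead of the O(n^2 * d) softmax form.
        context = torch.einsum("bhnd,bhne->bhde", k, v) / n
        out = torch.einsum("bhnd,bhde->bhne", q, context)
        out = out.transpose(1, 2).reshape(b, n, d)
        return self.proj(out)


if __name__ == "__main__":
    # Usage sketch: 2 inputs, each sampled on a 64x64 grid with 32 channels.
    x = torch.randn(2, 64 * 64, 32)
    attn = GalerkinSelfAttention(dim=32, heads=4)
    print(attn(x).shape)  # torch.Size([2, 4096, 32])
```

Because the point dimension never appears quadratically, the same layer can be applied to inputs discretized at different resolutions, which is what makes this style of attention a natural fit for resolution-invariant operator learning.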