Thera: Aliasing-Free Arbitrary-Scale Super-Resolution with Neural Heat Fields (2311.17643v3)
Abstract: Recent approaches to arbitrary-scale single image super-resolution (ASR) use neural fields to represent continuous signals that can be sampled at arbitrary resolutions. However, point-wise queries of neural fields do not naturally match the point spread function (PSF) of pixels, which may cause aliasing in the super-resolved image. Existing methods attempt to mitigate this by approximating an integral version of the field at each scaling factor, compromising both fidelity and generalization. In this work, we introduce neural heat fields, a novel neural field formulation that inherently models a physically exact PSF. Our formulation enables analytically correct anti-aliasing at any desired output resolution, and -- unlike supersampling -- at no additional cost. Building on this foundation, we propose Thera, an end-to-end ASR method that substantially outperforms existing approaches, while being more parameter-efficient and offering strong theoretical guarantees. The project page is at https://therasr.github.io.
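The abstract does not spell out the formulation, so the following is only a minimal sketch of the general principle behind analytic anti-aliasing, not the authors' implementation. It assumes a Gaussian PSF (the Gaussian is the kernel of the heat equation) and a hypothetical 1D field made of three sinusoids. Under these assumptions, blurring each sinusoid of frequency w with a Gaussian of standard deviation sigma only scales its amplitude by exp(-(w*sigma)^2 / 2), so a filtered query costs the same as a point-wise one. The names `COMPONENTS`, `field`, and `blurred_by_quadrature` are illustrative only.

```python
import math

# Hypothetical 1D "field": a sum of sinusoids given as (amplitude, frequency, phase).
# These values are made up for illustration and do not come from the paper.
COMPONENTS = [(1.0, 2.0, 0.3), (0.5, 9.0, 1.1), (0.25, 40.0, 0.0)]


def field(x, sigma=0.0):
    """Evaluate the field at x after blurring with a Gaussian of std sigma.

    Convolving sin(w*x + p) with a unit-mass Gaussian scales its amplitude by
    exp(-(w*sigma)**2 / 2), so the blurred value needs no extra samples.
    With sigma=0 this reduces to a plain point-wise query.
    """
    return sum(
        a * math.exp(-0.5 * (w * sigma) ** 2) * math.sin(w * x + p)
        for a, w, p in COMPONENTS
    )


def blurred_by_quadrature(x, sigma, n=4001, span=8.0):
    """Brute-force reference: numerically integrate field(x - u) * gaussian(u).

    This plays the role of supersampling: n point queries for one output value.
    """
    h = 2.0 * span * sigma / (n - 1)  # step size over [-span*sigma, span*sigma]
    total = 0.0
    for i in range(n):
        u = -span * sigma + i * h
        g = math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += field(x - u) * g * h
    return total


if __name__ == "__main__":
    x, sigma = 0.7, 0.1
    print("analytic  :", field(x, sigma))
    print("quadrature:", blurred_by_quadrature(x, sigma))
```

In this toy setting the closed-form value agrees with the brute-force integral while using a single evaluation instead of thousands, which is the sense in which anti-aliasing can come "at no additional cost". High-frequency components are attenuated most strongly as sigma grows, which is what suppresses aliasing at coarser output resolutions.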