DiSR-NeRF: Diffusion-Guided View-Consistent Super-Resolution NeRF (2404.00874v1)
Abstract: We present DiSR-NeRF, a diffusion-guided framework for view-consistent super-resolution (SR) NeRF. Unlike prior works, we circumvent the requirement for high-resolution (HR) reference images by leveraging existing powerful 2D super-resolution models. However, 2D images that are super-resolved independently are often inconsistent across views. We therefore propose Iterative 3D Synchronization (I3DS), which mitigates this inconsistency by exploiting the inherent multi-view consistency of NeRF. Specifically, I3DS alternates between upscaling low-resolution (LR) rendered images with diffusion models and updating the underlying 3D representation with standard NeRF training. We further introduce Renoised Score Distillation (RSD), a novel score-distillation objective for 2D image super-resolution. RSD combines features of ancestral sampling and Score Distillation Sampling (SDS) to generate sharp images that are also LR-consistent. Qualitative and quantitative results on both synthetic and real-world datasets demonstrate that DiSR-NeRF achieves better results on NeRF super-resolution than existing works. Code and video results are available at the project website.
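The alternation described for I3DS can be illustrated with a toy sketch. This is not the authors' implementation: the "scene" is a 1D list of floats rather than a NeRF, each "view" is a window into that list, `upscale_stub` is a hypothetical stand-in that only adds independent per-view detail (it does not change resolution or call a diffusion model), and `fit_scene` is a hypothetical stand-in for NeRF training that averages overlapping view targets. All function names and parameters are assumptions made for illustration.

```python
import random


def render(scene, view):
    """Render one view: here, just a (start, stop) window into the 1D scene."""
    start, stop = view
    return scene[start:stop]


def upscale_stub(image, rng, strength=0.1):
    """Hypothetical stand-in for a 2D diffusion upscaler.

    Resolution is kept fixed for brevity; the stub only adds independent
    per-view detail, which is the source of cross-view inconsistency.
    """
    return [pixel + rng.gauss(0.0, strength) for pixel in image]


def fit_scene(scene, views, targets, lr=0.5):
    """Hypothetical stand-in for NeRF training.

    Each scene cell moves toward the mean of all view targets observing it,
    so disagreement between overlapping views is averaged out.
    """
    sums = [0.0] * len(scene)
    counts = [0] * len(scene)
    for (start, _stop), target in zip(views, targets):
        for offset, value in enumerate(target):
            sums[start + offset] += value
            counts[start + offset] += 1
    return [
        s + lr * (sums[i] / counts[i] - s) if counts[i] else s
        for i, s in enumerate(scene)
    ]


def i3ds_sketch(scene, views, cycles=4, seed=0):
    """Alternate between per-view 2D refinement and a shared scene update."""
    rng = random.Random(seed)
    for _ in range(cycles):
        renders = [render(scene, v) for v in views]          # render from 3D
        targets = [upscale_stub(r, rng) for r in renders]    # refine each view in 2D
        scene = fit_scene(scene, views, targets)             # synchronize in 3D
    return scene


if __name__ == "__main__":
    views = [(0, 4), (2, 6)]  # two overlapping views of a 6-cell scene
    print(i3ds_sketch([0.0] * 6, views))
```

Because every view is re-rendered from the single shared scene at the start of each cycle, detail introduced by the per-view step survives only to the extent that the scene update can reconcile it across overlapping views, which is the intuition behind the synchronization step.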