ColNeRF: Collaboration for Generalizable Sparse Input Neural Radiance Field (2312.09095v2)
Abstract: Neural Radiance Fields (NeRF) have demonstrated impressive potential in synthesizing novel views from dense input; however, their effectiveness drops when only sparse input is available. Existing approaches that incorporate additional depth or semantic supervision can alleviate this issue to an extent. However, collecting such supervision is not only costly but also potentially inaccurate, leading to poor performance and weak generalization in diverse scenarios. In our work, we introduce a novel model, Collaborative Neural Radiance Fields (ColNeRF), designed to work with sparse input. The collaboration in ColNeRF includes both the cooperation among the sparse input images and the cooperation among the outputs of the neural radiance field. Through this, we construct a novel collaborative module that aligns information from various views while imposing self-supervised constraints to ensure multi-view consistency in both geometry and appearance. A Collaborative Cross-View Volume Integration module (CCVI) is proposed to capture complex occlusions and implicitly infer the spatial location of objects. Moreover, we introduce self-supervision of target rays projected in multiple directions to ensure geometric and color consistency in adjacent regions. Benefiting from the collaboration at both the input and output ends, ColNeRF captures richer and more generalizable scene representations, thereby yielding higher-quality novel view synthesis. Extensive experiments demonstrate that ColNeRF outperforms state-of-the-art sparse-input generalizable NeRF methods. Furthermore, when fine-tuned on new scenes, our approach achieves performance competitive with per-scene optimized NeRF-based methods while significantly reducing computational cost. Our code is available at: https://github.com/eezkni/ColNeRF.