
OR-NeRF: Object Removing from 3D Scenes Guided by Multiview Segmentation with Neural Radiance Fields (2305.10503v3)

Published 17 May 2023 in cs.CV

Abstract: The emergence of Neural Radiance Fields (NeRF) for novel view synthesis has increased interest in 3D scene editing. An essential editing task is removing objects from a scene while preserving visual plausibility and multiview consistency. However, current methods face challenges such as time-consuming object labeling, limited capability to remove specific targets, and compromised rendering quality after removal. This paper proposes a novel object-removal pipeline, named OR-NeRF, that removes objects from 3D scenes given user-supplied points or text prompts on a single view, achieving better performance in less time than previous works. Our method spreads user annotations to all views through 3D geometry and sparse correspondence, ensuring 3D consistency with less processing burden. The recent 2D segmentation model Segment Anything (SAM) is then applied to predict masks, and a 2D inpainting model is used to generate color supervision. Finally, our algorithm applies depth supervision and a perceptual loss to maintain consistency in geometry and appearance after object removal. Experimental results demonstrate that our method achieves better editing quality in less time than previous works, both qualitatively and quantitatively.
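
As a rough illustration of the annotation-spreading step described in the abstract, the sketch below lifts a user-clicked pixel in a reference view to a 3D point and reprojects it into a second view, where it could serve as a point prompt for SAM. This is a minimal toy example, not the authors' implementation: it assumes pinhole intrinsics, known world-to-camera poses, and a depth value for the clicked pixel, whereas OR-NeRF itself propagates annotations through SfM geometry and sparse correspondences. All names and numbers are hypothetical.

import numpy as np

def unproject(pixel, depth, K, R, t):
    """Lift a pixel (u, v) with known depth into a 3D world point."""
    u, v = pixel
    cam_pt = np.linalg.inv(K) @ np.array([u, v, 1.0]) * depth  # camera coordinates
    return R.T @ (cam_pt - t)                                  # world coordinates

def project(world_pt, K, R, t):
    """Project a 3D world point into pixel coordinates of a target view."""
    cam_pt = R @ world_pt + t
    uvw = K @ cam_pt
    return uvw[:2] / uvw[2]

# Toy two-view setup: shared intrinsics, reference camera at the origin,
# target camera rotated 10 degrees about the y-axis and slightly translated.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R_ref, t_ref = np.eye(3), np.zeros(3)
theta = np.deg2rad(10)
R_tgt = np.array([[np.cos(theta), 0.0, np.sin(theta)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(theta), 0.0, np.cos(theta)]])
t_tgt = np.array([-0.2, 0.0, 0.05])

click = (350.0, 260.0)   # user point prompt in the reference view
depth_at_click = 2.0     # depth of that pixel (e.g. from SfM or a depth map)

world_pt = unproject(click, depth_at_click, K, R_ref, t_ref)
prompt_in_target = project(world_pt, K, R_tgt, t_tgt)
print(prompt_in_target)  # pixel location to pass to SAM as a prompt in the target view

In the pipeline sketched by the abstract, such per-view prompts would drive SAM's mask prediction, the masks would define the regions filled by the 2D inpainter, and the inpainted colors and depths would supervise the NeRF during the removal stage.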

References (61)
  1. B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, “NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis,” in Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part I, 2020, pp. 405–421. [Online]. Available: https://doi.org/10.1007/978-3-030-58452-8_24
  2. Y. Peng, Y. Yan, S. Liu, Y. Cheng, S. Guan, B. Pan, G. Zhai, and X. Yang, “CageNeRF: Cage-based Neural Radiance Field for Generalized 3D Deformation and Animation,” in Advances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, Eds., vol. 35, 2022, pp. 31402–31415. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2022/file/cb78e6b5246b03e0b82b4acc8b11cc21-Paper-Conference.pdf
  3. T. Xu and T. Harada, “Deforming Radiance Fields with Cages,” in Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXXIII, 2022, pp. 159–175. [Online]. Available: https://doi.org/10.1007/978-3-031-19827-4_10
  4. Y.-J. Yuan, Y.-T. Sun, Y.-K. Lai, Y. Ma, R. Jia, and L. Gao, “NeRF-Editing: Geometry Editing of Neural Radiance Fields,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 18332–18343.
  5. F. Xiang, Z. Xu, M. Hašan, Y. Hold-Geoffroy, K. Sunkavalli, and H. Su, “NeuTex: Neural Texture Mapping for Volumetric Neural Rendering,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 7115–7124.
  6. B. Yang, C. Bao, J. Zeng, H. Bao, Y. Zhang, Z. Cui, and G. Zhang, “NeuMesh: Learning Disentangled Neural Mesh-Based Implicit Field for Geometry and Texture Editing,” in Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XVI, 2022, pp. 597–614. [Online]. Available: https://doi.org/10.1007/978-3-031-19787-1_34
  7. B. Yang, Y. Zhang, Y. Xu, Y. Li, H. Zhou, H. Bao, G. Zhang, and Z. Cui, “Learning Object-Compositional Neural Radiance Field for Editable Scene Rendering,” in 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 13759–13768.
  8. Q. Wu, X. Liu, Y. Chen, K. Li, C. Zheng, J. Cai, and J. Zheng, “Object-Compositional Neural Implicit Surfaces,” in Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXVII, 2022, pp. 197–213. [Online]. Available: https://doi.org/10.1007/978-3-031-19812-0_12
  9. S. Weder, G. Garcia-Hernando, A. Monszpart, M. Pollefeys, G. Brostow, M. Firman, and S. Vicente, “Removing Objects From Neural Radiance Fields,” 2022. [Online]. Available: http://arxiv.org/abs/2212.11966
  10. A. Mirzaei, T. Aumentado-Armstrong, K. G. Derpanis, J. Kelly, M. A. Brubaker, I. Gilitschenski, and A. Levinshtein, “SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields,” 2023. [Online]. Available: http://arxiv.org/abs/2211.12254
  11. R. Goel, D. Sirikonda, S. Saini, and P. J. Narayanan, “Interactive Segmentation of Radiance Fields,” 2023. [Online]. Available: http://arxiv.org/abs/2212.13545
  12. R. Suvorov, E. Logacheva, A. Mashikhin, A. Remizova, A. Ashukha, A. Silvestrov, N. Kong, H. Goka, K. Park, and V. Lempitsky, “Resolution-robust Large Mask Inpainting with Fourier Convolutions,” in 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, pp. 3172–3182.
  13. Y. Hao, Y. Liu, Z. Wu, L. Han, Y. Chen, G. Chen, L. Chu, S. Tang, Z. Yu, Z. Chen, and B. Lai, “EdgeFlow: Achieving Practical Interactive Segmentation with Edge-Guided Flow,” in 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2021, pp. 1551–1560.
  14. M. Caron, H. Touvron, I. Misra, H. Jegou, J. Mairal, P. Bojanowski, and A. Joulin, “Emerging Properties in Self-Supervised Vision Transformers,” in 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 9630–9640.
  15. T. Zhou, F. Porikli, D. J. Crandall, L. Van Gool, and W. Wang, “A survey on deep learning technique for video segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 6, pp. 7099–7122, 2023.
  16. S. Zhi, T. Laidlow, S. Leutenegger, and A. J. Davison, “In-Place Scene Labelling and Understanding with Implicit Scene Representation,” in 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 15818–15827.
  17. S. Kobayashi, E. Matsumoto, and V. Sitzmann, “Decomposing NeRF for Editing via Feature Field Distillation,” in Advances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, Eds., vol. 35, 2022, pp. 23311–23330. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2022/file/93f250215e4889119807b6fac3a57aec-Paper-Conference.pdf
  18. A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, and I. Sutskever, “Learning Transferable Visual Models From Natural Language Supervision,” in Proceedings of the 38th International Conference on Machine Learning, 2021, pp. 8748–8763. [Online]. Available: https://proceedings.mlr.press/v139/radford21a.html
  19. J. L. Schönberger, T. Price, T. Sattler, J.-M. Frahm, and M. Pollefeys, “A vote-and-verify strategy for fast spatial verification in image retrieval,” in Asian Conference on Computer Vision (ACCV), 2016.
  20. A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W.-Y. Lo, P. Dollár, and R. Girshick, “Segment Anything,” 2023. [Online]. Available: http://arxiv.org/abs/2304.02643
  21. A. Chen, Z. Xu, A. Geiger, J. Yu, and H. Su, “TensoRF: Tensorial Radiance Fields,” in Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXXII, 2022, pp. 333–350. [Online]. Available: https://doi.org/10.1007/978-3-031-19824-3_20
  22. Z. Qiu, T. Yao, and T. Mei, “Learning deep spatio-temporal dependence for semantic video segmentation,” IEEE Transactions on Multimedia, vol. 20, no. 4, pp. 939–949, 2018.
  23. L. Wang and C. Jung, “Example-based video stereolization with foreground segmentation and depth propagation,” IEEE Transactions on Multimedia, vol. 16, no. 7, pp. 1905–1914, 2014.
  24. L. Zhao, H. Zhou, X. Zhu, X. Song, H. Li, and W. Tao, “LIF-Seg: LiDAR and camera image fusion for 3D LiDAR semantic segmentation,” IEEE Transactions on Multimedia, pp. 1–11, 2023.
  25. A. H. Abdulnabi, B. Shuai, Z. Zuo, L.-P. Chau, and G. Wang, “Multimodal recurrent neural networks with information transfer layers for indoor scene labeling,” IEEE Transactions on Multimedia, vol. 20, no. 7, pp. 1656–1671, 2018.
  26. Z. Fan, P. Wang, Y. Jiang, X. Gong, D. Xu, and Z. Wang, “NeRF-SOS: Any-View Self-supervised Object Segmentation on Complex Scenes,” 2022. [Online]. Available: http://arxiv.org/abs/2209.08776
  27. X. Liu, J. Chen, H. Yu, Y.-W. Tai, and C.-K. Tang, “Unsupervised Multi-View Object Segmentation Using Radiance Field Propagation,” in Advances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, Eds., vol. 35, 2022, pp. 17730–17743. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2022/file/70de9e3948645a1be2de657f14d85c6d-Paper-Conference.pdf
  28. M. Wallingford, A. Kusupati, A. Fang, V. Ramanujan, A. Kembhavi, R. Mottaghi, and A. Farhadi, “Neural Radiance Field Codebooks,” in ICLR, 2023. [Online]. Available: https://openreview.net/forum?id=mX56bKDybu5
  29. X. Fu, S. Zhang, T. Chen, Y. Lu, L. Zhu, X. Zhou, A. Geiger, and Y. Liao, “Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation,” 2022. [Online]. Available: http://arxiv.org/abs/2203.15224
  30. C. Bao, Y. Zhang, B. Yang, T. Fan, Z. Yang, H. Bao, G. Zhang, and Z. Cui, “SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field,” 2023. [Online]. Available: http://arxiv.org/abs/2303.13277
  31. S. Benaim, F. Warburg, P. E. Christensen, and S. Belongie, “Volumetric Disentanglement for 3D Scene Manipulation,” 2022. [Online]. Available: http://arxiv.org/abs/2206.02776
  32. A. Mikaeili, O. Perel, D. Cohen-Or, and A. Mahdavi-Amiri, “SKED: Sketch-guided Text-based 3D Editing,” 2023. [Online]. Available: http://arxiv.org/abs/2303.10735
  33. E. Sella, G. Fiebelman, P. Hedman, and H. Averbuch-Elor, “Vox-E: Text-guided Voxel Editing of 3D Objects,” 2023. [Online]. Available: http://arxiv.org/abs/2303.12048
  34. Z. Chen, K. Yin, and S. Fidler, “AUV-Net: Learning Aligned UV Maps for Texture Transfer and Synthesis,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 1455–1464.
  35. K. Rematas, R. Martin-Brualla, and V. Ferrari, “Sharf: Shape-conditioned Radiance Fields from a Single View,” in Proceedings of the 38th International Conference on Machine Learning, 2021, pp. 8948–8958. [Online]. Available: https://proceedings.mlr.press/v139/rematas21a.html
  36. H.-X. Yu, L. Guibas, and J. Wu, “Unsupervised Discovery of Object Radiance Fields,” in ICLR, 2022. [Online]. Available: https://openreview.net/forum?id=rwE8SshAlxw
  37. H.-K. Liu, I.-C. Shen, and B.-Y. Chen, “NeRF-In: Free-Form NeRF Inpainting with RGB-D Priors,” 2022. [Online]. Available: http://arxiv.org/abs/2206.04901
  38. V. Lazova, V. Guzov, K. Olszewski, S. Tulyakov, and G. Pons-Moll, “Control-NeRF: Editable Feature Volumes for Scene Rendering and Manipulation,” 2022. [Online]. Available: http://arxiv.org/abs/2204.10850
  39. B. Wang, L. Chen, and B. Yang, “DM-NeRF: 3D Scene Geometry Decomposition and Manipulation from 2D Images,” in ICLR, 2023. [Online]. Available: https://openreview.net/forum?id=C_PRLz8bEJx
  40. S. Liu, X. Zhang, Z. Zhang, R. Zhang, J.-Y. Zhu, and B. Russell, “Editing Conditional Radiance Fields,” in 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 5753–5763.
  41. J. Zhu, Y. Huo, Q. Ye, F. Luan, J. Li, D. Xi, L. Wang, R. Tang, W. Hua, H. Bao, and R. Wang, “I²-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs,” 2023. [Online]. Available: http://arxiv.org/abs/2303.07634
  42. W. Ye, S. Chen, C. Bao, H. Bao, M. Pollefeys, Z. Cui, and G. Zhang, “IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis,” 2023. [Online]. Available: http://arxiv.org/abs/2210.00647
  43. A. Mirzaei, Y. Kant, J. Kelly, and I. Gilitschenski, “LaTeRF: Label and Text Driven Object Radiance Fields,” in Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part III, 2022, pp. 20–36. [Online]. Available: https://doi.org/10.1007/978-3-031-20062-5_2
  44. Z. Kuang, F. Luan, S. Bi, Z. Shu, G. Wetzstein, and K. Sunkavalli, “PaletteNeRF: Palette-based Appearance Editing of Neural Radiance Fields,” 2023. [Online]. Available: http://arxiv.org/abs/2212.10699
  45. B. Li, K. Q. Weinberger, S. Belongie, V. Koltun, and R. Ranftl, “Language-driven Semantic Segmentation,” in ICLR, 2022. [Online]. Available: https://openreview.net/forum?id=RriDjddCLN
  46. V. Tschernezki, I. Laina, D. Larlus, and A. Vedaldi, “Neural Feature Fusion Fields: 3D Distillation of Self-Supervised 2D Image Representations,” in 2022 International Conference on 3D Vision (3DV), 2022, pp. 443–453.
  47. K. Deng, A. Liu, J.-Y. Zhu, and D. Ramanan, “Depth-supervised NeRF: Fewer Views and Faster Training for Free,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 12872–12881.
  48. J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual losses for real-time style transfer and super-resolution,” in Computer Vision – ECCV 2016, B. Leibe, J. Matas, N. Sebe, and M. Welling, Eds., 2016, pp. 694–711.
  49. IDEA-Research, “Grounded-SAM,” https://github.com/IDEA-Research/Grounded-Segment-Anything, 2023.
  50. S. Liu, Z. Zeng, T. Ren, F. Li, H. Zhang, J. Yang, C. Li, J. Yang, H. Su, J. Zhu et al., “Grounding DINO: Marrying DINO with grounded pre-training for open-set object detection,” arXiv preprint arXiv:2303.05499, 2023.
  51. Q. Wang, Z. Wang, K. Genova, P. Srinivasan, H. Zhou, J. T. Barron, R. Martin-Brualla, N. Snavely, and T. Funkhouser, “IBRNet: Learning multi-view image-based rendering,” in CVPR, 2021.
  52. B. Mildenhall, P. P. Srinivasan, R. Ortiz-Cayon, N. K. Kalantari, R. Ramamoorthi, R. Ng, and A. Kar, “Local light field fusion: Practical view synthesis with prescriptive sampling guidelines,” ACM Transactions on Graphics (TOG), 2019.
  53. R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, “The Unreasonable Effectiveness of Deep Features as a Perceptual Metric,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 586–595.
  54. M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, “GANs trained by a two time-scale update rule converge to a local nash equilibrium,” in Proceedings of the 31st International Conference on Neural Information Processing Systems, ser. NIPS’17, 2017, pp. 6629–6640.
  55. L. Zhang and M. Agrawala, “Adding Conditional Control to Text-to-Image Diffusion Models,” 2023. [Online]. Available: http://arxiv.org/abs/2302.05543
  56. A. Haque, M. Tancik, A. A. Efros, A. Holynski, and A. Kanazawa, “Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions,” 2023. [Online]. Available: http://arxiv.org/abs/2303.12789
  57. A. Raj, S. Kaza, B. Poole, M. Niemeyer, N. Ruiz, B. Mildenhall, S. Zada, K. Aberman, M. Rubinstein, J. Barron, Y. Li, and V. Jampani, “DreamBooth3D: Subject-Driven Text-to-3D Generation,” 2023. [Online]. Available: http://arxiv.org/abs/2303.13508
  58. B. Poole, A. Jain, J. T. Barron, and B. Mildenhall, “DreamFusion: Text-to-3D using 2D Diffusion,” in ICLR, 2023. [Online]. Available: https://openreview.net/forum?id=FjNys5c7VyY
  59. Z. Zhou and S. Tulsiani, “SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction,” 2023. [Online]. Available: http://arxiv.org/abs/2212.00792
  60. U. Singer, S. Sheynin, A. Polyak, O. Ashual, I. Makarov, F. Kokkinos, N. Goyal, A. Vedaldi, D. Parikh, J. Johnson, and Y. Taigman, “Text-To-4D Dynamic Scene Generation,” 2023. [Online]. Available: http://arxiv.org/abs/2301.11280
  61. S. Cao, W. Chai, S. Hao, Y. Zhang, H. Chen, and G. Wang, “DiffFashion: Reference-based fashion design with structure-aware transfer by diffusion models,” IEEE Transactions on Multimedia, pp. 1–13, 2023.
Authors (4)
  1. Youtan Yin (3 papers)
  2. Zhoujie Fu (5 papers)
  3. Fan Yang (878 papers)
  4. Guosheng Lin (157 papers)
Citations (25)
