GenN2N: Generative NeRF2NeRF Translation (2404.02788v1)

Published 3 Apr 2024 in cs.CV

Abstract: We present GenN2N, a unified NeRF-to-NeRF translation framework for various NeRF translation tasks such as text-driven NeRF editing, colorization, super-resolution, inpainting, etc. Unlike previous methods designed for individual translation tasks with task-specific schemes, GenN2N achieves all these NeRF editing tasks by employing a plug-and-play image-to-image translator to perform editing in the 2D domain and lifting 2D edits into the 3D NeRF space. Since the 3D consistency of 2D edits may not be assured, we propose to model the distribution of the underlying 3D edits through a generative model that can cover all possible edited NeRFs. To model the distribution of 3D edited NeRFs from 2D edited images, we carefully design a VAE-GAN that encodes images while decoding NeRFs. The latent space is trained to align with a Gaussian distribution and the NeRFs are supervised through an adversarial loss on its renderings. To ensure the latent code does not depend on 2D viewpoints but truly reflects the 3D edits, we also regularize the latent code through a contrastive learning scheme. Extensive experiments on various editing tasks show GenN2N, as a universal framework, performs as well or better than task-specific specialists while possessing flexible generative power. More results on our project page: https://xiangyueliu.github.io/GenN2N/

Authors (5)
  1. Xiangyue Liu
  2. Han Xue
  3. Kunming Luo
  4. Ping Tan
  5. Li Yi

Summary

Overview of GenN2N: Enhancing NeRFs through Generative Translation

The paper introduces GenN2N, a unified framework for NeRF-to-NeRF translation that handles diverse 3D editing tasks such as text-driven editing, colorization, super-resolution, and inpainting. This unification contrasts with existing NeRF editing schemes, each designed for a single task with task-specific machinery. GenN2N instead leverages off-the-shelf 2D image editing models to perform these edits while maintaining the multi-view consistency essential to coherent 3D scenes.

Key Concepts and Methodology

GenN2N's methodology consists of two stages:

  1. 2D Image-to-Image Editing: Using a plug-and-play image-to-image translator, GenN2N applies the desired edit to images rendered from the source NeRF. These 2D edits cover common tasks such as colorization and super-resolution, leveraging the maturity and adaptability of existing 2D editing tools (see the first sketch after this list).
  2. 3D NeRF Adaptation: After 2D editing, the framework lifts the edits into the 3D NeRF. Because per-view 2D edits are not guaranteed to be 3D-consistent, the framework models the distribution of plausible 3D edits with a variational autoencoder (VAE) augmented by generative adversarial network (GAN) components: the latent space is aligned with a Gaussian distribution, renderings of the edited NeRF are supervised with an adversarial loss, and a contrastive objective disentangles the 3D edit from the 2D viewpoint (see the loss sketch after this list).
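
To make stage 1 concrete, here is a minimal sketch, not the authors' released code. It assumes a trained NeRF exposing a hypothetical `render(pose)` method returning a PIL image, and uses InstructPix2Pix from Hugging Face `diffusers` as one example of a plug-and-play 2D translator; any image-to-image editor could be substituted.

```python
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline

# One example of a plug-and-play 2D editor; the framework is agnostic to the
# specific translator used.
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

def edit_training_views(nerf, poses, instruction):
    """Render each training view from the source NeRF and edit it in 2D.

    Because the editor is applied per view, the edited images are not
    guaranteed to be 3D-consistent; stage 2 resolves this by modeling the
    distribution of plausible 3D edits.
    """
    edited = []
    for pose in poses:
        view = nerf.render(pose)  # hypothetical interface, PIL image out
        out = pipe(instruction, image=view,
                   num_inference_steps=20,
                   image_guidance_scale=1.5).images[0]
        edited.append(out)
    return edited
```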
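Stage 2 combines reconstruction, KL, adversarial, and contrastive objectives. The PyTorch sketch below illustrates one plausible composition of these losses; the module interfaces (`encoder`, `nerf`, `disc`), the InfoNCE-style contrastive formulation, and the loss weights `w` are assumptions for illustration and may differ from the paper's exact design.

```python
import torch
import torch.nn.functional as F

def kl_loss(mu, logvar):
    # Align the per-image edit latent with a standard Gaussian, as in a VAE.
    return -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())

def contrastive_loss(z, edit_ids, temperature=0.1):
    # InfoNCE-style objective (an assumed formulation): latents from different
    # viewpoints of the SAME 2D edit are pulled together, latents of different
    # edits pushed apart, so z encodes the edit rather than the viewpoint.
    z = F.normalize(z, dim=-1)
    sim = z @ z.t() / temperature
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos = (edit_ids.unsqueeze(0) == edit_ids.unsqueeze(1)) & ~self_mask
    log_prob = F.log_softmax(sim.masked_fill(self_mask, float("-inf")), dim=1)
    return -(log_prob * pos.float()).sum(1).div(pos.sum(1).clamp(min=1)).mean()

def training_step(encoder, nerf, disc, edited_views, poses, edit_ids, w):
    # encoder, nerf (latent-conditioned), and disc are hypothetical modules.
    mu, logvar = encoder(edited_views)          # encode 2D edits to latents
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
    renders = nerf(poses, z)                    # render the edited NeRF at the
                                                # corresponding camera poses
    l_rec = F.l1_loss(renders, edited_views)    # match the 2D edits
    l_adv = -disc(renders).mean()               # generator-side GAN loss
    return (l_rec + w["kl"] * kl_loss(mu, logvar)
            + w["adv"] * l_adv + w["con"] * contrastive_loss(z, edit_ids))
```

At test time, sampling different latent codes from the Gaussian prior yields different plausible edited NeRFs, which is the source of the framework's generative flexibility.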

Experimental Results and Implications

The experiments cover diverse datasets and demonstrate the effectiveness of GenN2N across a range of neural radiance field applications. Notably, the framework matches or surpasses task-specific methods while offering significant flexibility in generating diverse edited NeRFs. These results underscore GenN2N's potential to simplify the creation and customization of 3D models, producing high-quality renderings without task-dependent methodological modifications.

Implications for Future AI Development

The impact of GenN2N extends beyond its immediate applications, pointing toward broader integration of powerful 2D generative models into 3D content pipelines and AI systems. This approach could reshape how high-quality 3D content is processed and manipulated, with potential applications in VR, AR, and other immersive technologies. Its generative capabilities and adaptability position GenN2N as a meaningful contribution toward overcoming the limitations of task-specific 3D editing.

Overall, GenN2N represents a significant step in leveraging generative models for neural rendering, setting the stage for further advances in the seamless editing and translation of neural scene representations.
