3D scene generation from scene graphs and self-attention (2404.01887v3)

Published 2 Apr 2024 in cs.CV

Abstract: Synthesizing realistic and diverse indoor 3D scene layouts in a controllable fashion opens up applications in simulated navigation and virtual reality. As concise and robust representations of a scene, scene graphs have proven to be well-suited as the semantic control on the generated layout. We present a variant of the conditional variational autoencoder (cVAE) model to synthesize 3D scenes from scene graphs and floor plans. We exploit the properties of self-attention layers to capture high-level relationships between objects in a scene, and use these as the building blocks of our model. Our model leverages graph transformers to estimate the size, dimension and orientation of the objects in a room while satisfying relationships in the given scene graph. Our experiments show that self-attention layers lead to sparser (7.9x compared to Graphto3D) and more diverse scenes (16%).
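
As a rough illustration of how a self-attention layer lets every object in a scene attend to every other object, the following is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is not the architecture from the paper: the function names, the random projection matrices, and the sizes (`num_objects`, `d_model`) are all assumptions chosen for illustration, and the graph-transformer and cVAE components are not modeled here.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap rows and columns of a matrix."""
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x holds one feature vector per scene object (hypothetical embeddings).
    Returns the updated object features and the attention weights.
    """
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    # Row i says how strongly object i attends to each object in the scene.
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Random matrix standing in for learned parameters (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
num_objects, d_model = 4, 8  # illustrative sizes, not taken from the paper
objects = random_matrix(num_objects, d_model, rng)
w_q = random_matrix(d_model, d_model, rng)
w_k = random_matrix(d_model, d_model, rng)
w_v = random_matrix(d_model, d_model, rng)

updated, weights = self_attention(objects, w_q, w_k, w_v)
```

Each row of `weights` sums to 1, so every updated object feature is a weighted mixture of all objects' value vectors. That mixing is the general mechanism by which attention can relate objects to one another.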

Authors (7)

Pietro Bonazzi (11 papers)
Mengqi Wang (35 papers)
Diego Martin Arroyo (7 papers)
Fabian Manhardt (41 papers)
Nico Messikommer (1 paper)
Federico Tombari (214 papers)
Davide Scaramuzza (190 papers)
