OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion (2507.06165v1)

Published 8 Jul 2025 in cs.CV

Abstract: The creation of 3D assets with explicit, editable part structures is crucial for advancing interactive applications, yet most generative methods produce only monolithic shapes, limiting their utility. We introduce OmniPart, a novel framework for part-aware 3D object generation designed to achieve high semantic decoupling among components while maintaining robust structural cohesion. OmniPart uniquely decouples this complex task into two synergistic stages: (1) an autoregressive structure planning module generates a controllable, variable-length sequence of 3D part bounding boxes, critically guided by flexible 2D part masks that allow for intuitive control over part decomposition without requiring direct correspondences or semantic labels; and (2) a spatially-conditioned rectified flow model, efficiently adapted from a pre-trained holistic 3D generator, synthesizes all 3D parts simultaneously and consistently within the planned layout. Our approach supports user-defined part granularity, precise localization, and enables diverse downstream applications. Extensive experiments demonstrate that OmniPart achieves state-of-the-art performance, paving the way for more interpretable, editable, and versatile 3D content.

Summary

The paper presents a dual-stage generative framework that separates part structure planning from spatially-conditioned part synthesis for controllable 3D generation.
It employs an autoregressive transformer with mask guidance to predict explicit 3D part bounding boxes, ensuring semantic decoupling and structural cohesion.
Evaluated on a dataset of 180K annotated objects, OmniPart achieves superior geometric and semantic fidelity, enabling applications like compositional editing and material customization.

OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion

The paper introduces OmniPart, a framework for part-aware 3D object generation. It targets semantic decoupling among parts and structural cohesion of the whole object, so that generated models are usable in interactive applications. Its two-stage approach produces 3D assets with explicit, editable part structures.

Part Structure Planning and Generation

OmniPart's central contribution is its two-stage generative framework. The first stage performs structure planning: an autoregressive module generates a variable-length sequence of 3D part bounding boxes. This process is guided by 2D part masks, which give intuitive control over part decomposition without requiring direct correspondences or semantic labels. The masks can be drawn manually by users or extracted with pre-trained segmentation models such as SAM. The resulting bounding boxes serve as spatial guides for the 3D parts generated in the second stage.

Figure 1: An overview of the OmniPart model design. OmniPart generates part-aware, controllable, and high-quality 3D content through two key stages: part structure planning and structured part latent generation.

The second stage uses spatially-conditioned generation to synthesize all 3D parts simultaneously within the planned layout. This stage adapts a pre-trained holistic generator, TRELLIS, to produce parts with stronger semantic awareness and structural coherence.
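The abstract describes the second stage as a rectified flow model. As a rough illustration of how sampling from such a model works in general, the sketch below integrates a velocity field with fixed Euler steps. It is not OmniPart's or TRELLIS's sampler: the time convention (noise at t = 0, data at t = 1), the function names, and the toy velocity field standing in for a trained network are all assumptions made for this example.

```python
from typing import Callable, List

Vector = List[float]
VelocityFn = Callable[[Vector, float], Vector]


def euler_sample(x0: Vector, velocity: VelocityFn, num_steps: int) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target: Vector) -> VelocityFn:
    """Hypothetical stand-in for a trained network: always points at `target`."""

    def velocity(x: Vector, t: float) -> Vector:
        # On the straight path x_t = (1 - t) * x0 + t * target, the constant
        # velocity (target - x0) equals (target - x_t) / (1 - t).
        return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]

    return velocity


if __name__ == "__main__":
    target = [1.0, -2.0, 0.5]
    sample = euler_sample([0.0, 0.0, 0.0], make_toy_velocity(target), num_steps=10)
    print(sample)
```

Because the toy field follows a perfectly straight path, Euler integration reaches the target up to floating-point error for any step count. A learned velocity field is only approximately straight, so the number of steps trades speed against accuracy.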

Implementation and Technical Details

OmniPart's implementation rests on two core modules: Controllable Structure Planning and Spatially-Conditioned Part Synthesis. The structure planning module uses an autoregressive transformer to predict the part layout as a sequence of bounding boxes. Its predictions are conditioned on flexible 2D masks, so the same object can be decomposed at different granularities or under different decomposition schemes.
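One common way to let an autoregressive transformer emit a variable-length list of boxes is to quantize each coordinate into a discrete token and end the sequence with a stop token. The sketch below shows that idea only. The bin count, the token layout, the `EOS` id, and the function names are assumptions for illustration and are not taken from the paper.

```python
from typing import List, Tuple

# Axis-aligned box in normalized coordinates: (xmin, ymin, zmin, xmax, ymax, zmax).
Box = Tuple[float, float, float, float, float, float]

NUM_BINS = 64   # assumed quantization resolution, not from the paper
EOS = NUM_BINS  # hypothetical end-of-sequence token id


def quantize(value: float, num_bins: int = NUM_BINS) -> int:
    """Map a coordinate in [0, 1] to a bin index in [0, num_bins - 1]."""
    clamped = min(max(value, 0.0), 1.0)
    return min(int(clamped * num_bins), num_bins - 1)


def boxes_to_tokens(boxes: List[Box]) -> List[int]:
    """Flatten boxes into six tokens each, followed by a single EOS token."""
    tokens: List[int] = []
    for box in boxes:
        tokens.extend(quantize(c) for c in box)
    tokens.append(EOS)
    return tokens


def tokens_to_boxes(tokens: List[int]) -> List[Box]:
    """Decode tokens back to boxes, using bin centres; stop at EOS."""
    boxes: List[Box] = []
    coords: List[float] = []
    for token in tokens:
        if token == EOS:
            break
        coords.append((token + 0.5) / NUM_BINS)
        if len(coords) == 6:
            boxes.append((coords[0], coords[1], coords[2],
                          coords[3], coords[4], coords[5]))
            coords = []
    return boxes


if __name__ == "__main__":
    layout = [(0.0, 0.0, 0.0, 0.5, 0.5, 0.5), (0.5, 0.0, 0.0, 1.0, 1.0, 1.0)]
    tokens = boxes_to_tokens(layout)
    print(tokens)
    print(tokens_to_boxes(tokens))
```

With this layout, the number of parts is not fixed in advance: the model decides it by choosing when to emit the stop token, which is what makes the sequence variable-length.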

The part synthesis module adapts TRELLIS by conditioning generation on the voxel regions inside each predicted bounding box and by adding part-aware embeddings that keep local parts consistent with the global shape. To achieve detailed outputs with limited annotations, the authors introduce a voxel discarding mechanism that identifies and filters out extraneous voxels early in the denoising process.
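To make the box-conditioning step concrete, the sketch below gathers, for each planned box, the occupied voxels that fall inside it. This covers only the geometric selection. The paper's voxel discarding happens inside the denoising process and is not reproduced here. The half-open box convention and all names are assumptions for this example.

```python
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]
# Axis-aligned box on the voxel grid: inclusive minimum, exclusive maximum.
Box = Tuple[int, int, int, int, int, int]


def voxels_in_box(voxels: List[Voxel], box: Box) -> List[Voxel]:
    """Return the voxels whose indices lie inside the half-open box."""
    x0, y0, z0, x1, y1, z1 = box
    return [
        (x, y, z)
        for (x, y, z) in voxels
        if x0 <= x < x1 and y0 <= y < y1 and z0 <= z < z1
    ]


def initialize_part_voxels(voxels: List[Voxel], boxes: List[Box]) -> Dict[int, List[Voxel]]:
    """Give every part the occupied voxels inside its box (overlaps are kept)."""
    return {index: voxels_in_box(voxels, box) for index, box in enumerate(boxes)}


if __name__ == "__main__":
    occupied = [(0, 0, 0), (1, 1, 1), (3, 3, 3)]
    layout = [(0, 0, 0, 2, 2, 2), (1, 1, 1, 4, 4, 4)]
    for part, part_voxels in initialize_part_voxels(occupied, layout).items():
        print(part, part_voxels)
```

In the usage example, voxel (1, 1, 1) lies in both boxes and so is handed to both parts. Axis-aligned boxes routinely overlap in this way, which is presumably why a later filtering step for voxels that do not belong to a part is needed.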

Figure 2: Spatially-conditioned part synthesis. Consistent generation of structured part latents ensures cohesion and quality in part-level outputs.

Datasets and Evaluation

OmniPart’s evaluation leverages a dataset comprising 180K 3D objects with detailed part-level annotations. The performance is benchmarked against existing segmentation-based and direct part generative methods, including Part123 and PartGen. Comparisons underline OmniPart’s ability to deliver superior part independence without compromising global cohesiveness.

The quantitative metrics are Chamfer Distance and F1-score at multiple distance thresholds, used to assess geometric and semantic fidelity at both the part level and the object level.
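For readers unfamiliar with these metrics, here is a minimal pure-Python sketch of both, computed on small point sets. Conventions differ between papers (squared or unsquared distances, sum or mean over the two directions), and the summary does not say which one OmniPart uses, so the choices below are assumptions.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def nearest_distances(source: List[Point], target: List[Point]) -> List[float]:
    """For each source point, the Euclidean distance to its nearest target point."""
    return [min(math.dist(p, q) for q in target) for p in source]


def chamfer_distance(a: List[Point], b: List[Point]) -> float:
    """Symmetric Chamfer Distance: mean nearest distance a->b plus mean b->a."""
    d_ab = nearest_distances(a, b)
    d_ba = nearest_distances(b, a)
    return sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba)


def f_score(pred: List[Point], gt: List[Point], threshold: float) -> float:
    """F1-score of precision and recall at a given distance threshold."""
    d_pred = nearest_distances(pred, gt)
    d_gt = nearest_distances(gt, pred)
    precision = sum(d <= threshold for d in d_pred) / len(d_pred)
    recall = sum(d <= threshold for d in d_gt) / len(d_gt)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5)]
    print(chamfer_distance(predicted, reference))  # 0.5
    print(f_score(predicted, reference, 0.1))      # 0.5
    print(f_score(predicted, reference, 1.0))      # 1.0
```

Reporting the F1-score at several thresholds shows how quickly the match improves as the tolerance loosens, which a single Chamfer value cannot convey. The brute-force nearest-neighbour search is quadratic, so real evaluations on dense point clouds would use a spatial index instead.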

Figure 3: Visualization of the training dataset. The dataset facilitates comprehensive evaluation through diverse part-count demonstrations.

Applications and Implications

OmniPart's flexible design fosters several downstream applications such as compositional editing, mask-controlled generation, and material customization. Tailored granularity control via 2D masks allows users to define specific structural patterns and apply independent texture modifications to parts.

Figure 4: Applications of our part-aware 3D generation framework. Part-aware outputs bolster a range of practical applications, demonstrating enhanced generation versatility.

Because it builds on structured latent representations, OmniPart synthesizes all parts concurrently and yields high-quality geometry. By keeping semantic coupling between components low, it takes a modular approach to generation, with implications for fields that work with 3D content.

Conclusion

OmniPart is a framework for part-aware 3D generation that separates structural planning from detailed synthesis. Its autoregressive planner and its adaptation of an existing holistic generator are a step toward more interpretable and interactive 3D assets. The method relies on axis-aligned bounding boxes, which leaves room for future work on more precise part localization that preserves structural coordination.

Overall, OmniPart paves the way for more comprehensive and scalable 3D modeling, reinforcing its use in contemporary visual computing and design.