- The paper introduces DeepMesh, an auto-regressive method for generating artist-like 3D meshes that pairs a refined pre-training strategy with reinforcement learning to align outputs with human preferences.
- DeepMesh employs a refined mesh tokenization algorithm that compresses sequences by approximately 72%, enhancing pre-training efficiency and stability for large transformer models up to 1 billion parameters.
- Integrating Reinforcement Learning via Direct Preference Optimization (DPO) allows DeepMesh to align generated meshes with human aesthetic and geometric preferences, yielding diverse and high-fidelity results.
DeepMesh: Enhancing 3D Mesh Generation Using Reinforcement Learning
The paper "DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning" introduces a novel approach to the generation of 3D triangle meshes, which are foundational in various industrial applications such as virtual reality, gaming, and animation. DeepMesh addresses the limitations seen in prior auto-regressive mesh generation methods, which often grapple with incomplete meshes and low face counts by employing an innovative combination of a refined pre-training strategy and reinforcement learning techniques.
Key Innovations in DeepMesh
DeepMesh’s contribution centers around two pivotal innovations:
- Refined Pre-training Strategy: The authors develop an enhanced mesh tokenization algorithm that compresses mesh sequences by approximately 72% without sacrificing geometric detail. This reduces computational cost and, together with a strategic data curation and packaging scheme, stabilizes training. Pre-training efficiency is further improved through techniques such as truncated training and optimized data-loading strategies, allowing the model to scale transformer architectures from 500 million to 1 billion parameters effectively. A minimal tokenization sketch follows this list.
- Integration of Reinforcement Learning: Introducing reinforcement learning into 3D mesh generation, specifically through Direct Preference Optimization (DPO), lets DeepMesh produce meshes that align with human preferences. The approach uses a scoring standard that combines human evaluation with conventional 3D metrics to select training samples, leading to outputs that meet both aesthetic and geometric standards.
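To make the tokenization idea concrete, here is a minimal sketch of the baseline coordinate-quantization scheme that auto-regressive mesh generators typically build on: vertex coordinates are discretized into a fixed number of bins and each face is serialized into a flat run of tokens. This is not DeepMesh's actual compressed tokenizer; the function names (`quantize_vertices`, `tokenize_mesh`) and the 128-bin resolution are illustrative assumptions. DeepMesh's algorithm shortens such sequences by roughly 72%.

```python
import numpy as np

def quantize_vertices(vertices, n_bins=128):
    """Map continuous xyz coordinates to n_bins discrete levels per axis.

    vertices: (V, 3) float array, assumed normalized to the unit cube.
    Returns a (V, 3) integer array of coordinate tokens in [0, n_bins - 1].
    """
    v = np.clip(vertices, 0.0, 1.0)
    return np.minimum((v * n_bins).astype(np.int64), n_bins - 1)

def tokenize_mesh(vertices, faces, n_bins=128):
    """Serialize a triangle mesh into a flat token sequence.

    Each face contributes 9 coordinate tokens (3 vertices x xyz), so a mesh
    with F faces becomes a sequence of 9 * F tokens. This is the verbose
    baseline representation that a compressed tokenizer like DeepMesh's
    improves on.
    """
    q = quantize_vertices(vertices, n_bins)
    tokens = []
    for face in faces:                      # faces: (F, 3) vertex indices
        for vid in face:
            tokens.extend(q[vid].tolist())  # append x, y, z tokens
    return np.asarray(tokens, dtype=np.int64)
```

With such a scheme, a 10,000-face mesh already yields a 90,000-token sequence, which is why shortening the sequence while keeping a compact vocabulary matters for training large transformers.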
Contributions and Methodology
DeepMesh offers substantial advances in the auto-regressive generation of artist-like meshes:
- Tokenization Algorithm: The proposed algorithm tokenizes high-resolution meshes into markedly shorter sequences while maintaining a compact vocabulary, keeping training tractable.
- Pre-training Execution: Refined data preparation and training strategies keep optimization stable on large, diverse datasets, enabling DeepMesh to train large transformers reliably.
- Human Preference Alignment: By collecting explicit preference pairs and applying DPO (see the loss sketch after this list), the model aligns its outputs with human aesthetic and geometric standards, yielding diverse, high-fidelity meshes that outperform state-of-the-art methods in precision and quality.
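The preference-alignment step can be illustrated with the standard DPO objective, where each training example is a pair of meshes generated for the same condition, one preferred by the scoring standard and one dispreferred. The sketch below assumes summed token log-probabilities have already been computed under the fine-tuned policy and a frozen pre-trained reference model; the function name and `beta` value are illustrative, not taken from the paper.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective over preference pairs of mesh token sequences.

    Each argument is a (batch,) tensor of summed log-probabilities of a
    mesh token sequence under either the policy being fine-tuned or the
    frozen pre-trained reference. 'chosen' is the human-preferred mesh,
    'rejected' the dispreferred one; beta controls how far the policy may
    drift from the reference.
    """
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Maximize the margin between preferred and dispreferred sequences.
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()
```

Because the loss only needs relative log-probabilities of whole sequences, it plugs directly into an auto-regressive mesh generator without requiring a separate reward model.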
Implications and Future Directions
The implications of this research are twofold: practical gains in the precision and visual appeal of auto-generated 3D meshes, and a broader case for applying reinforcement learning frameworks to artistic content generation. As industries increasingly rely on AI-driven design and modeling, methods like DeepMesh that ensure geometric accuracy and visual quality are invaluable.
Looking ahead, further refinement of the point cloud encoder may improve the model's ability to reproduce fine detail, while expanding the training datasets could improve generalization across varied 3D shapes. The scalability and performance of larger models also remain promising avenues, potentially improving generation quality and supporting more complex applications. This line of work could likewise deepen the understanding and application of RLHF (Reinforcement Learning from Human Feedback) strategies across AI and machine learning, particularly in domains that demand human-like creativity and aesthetic judgment.