- The paper introduces PolyGen, an autoregressive Transformer model that sequentially predicts mesh vertices and faces for 3D mesh generation.
- It achieves strong modeling performance: 2.46 bits per vertex (85.1% predictive accuracy) for the vertex model and 1.82 bits per vertex (90% accuracy) for the face model, outperforming uniform and Draco compression baselines.
- Conditional mesh generation with PolyGen broadens its applications, enabling efficient 3D model synthesis for virtual simulations and robotics.
PolyGen: An Autoregressive Generative Model of 3D Meshes
The paper "PolyGen: An Autoregressive Generative Model of 3D Meshes" introduces a novel approach to directly model polygon meshes using deep learning. Polygon meshes, essential in computer graphics and robotics, present challenges for learning-based models because they combine unordered elements with discrete, variable-length face structures. This work addresses these challenges through the proposed model, PolyGen, which leverages a Transformer-based architecture to sequentially predict mesh vertices and faces.
Technical Contributions
PolyGen is structured to explicitly autoregress over 3D mesh data, modeling the meshes as a joint distribution over vertices and faces. The model consists of two primary components:
- Vertex Model: Generates mesh vertices as a flattened sequence of discrete coordinates. Vertex coordinate tuples are ordered as (z, y, x) and concatenated into a single sequence, with a stopping token indicating completion. A masked Transformer decoder captures the non-local dependencies present in mesh geometry.
- Face Model: Generates mesh faces conditioned on the previously generated vertices, combining Transformers with pointer networks so that each face token selects one of the variable number of available vertices that define a polygon face.
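To make the vertex model's input concrete, here is a minimal sketch of how a vertex set might be quantized, sorted, and flattened into a single token sequence with a stopping token. The function name, the 8-bit quantization, the coordinate range of [-0.5, 0.5], and the choice of stop-token id are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def flatten_vertices(vertices, n_bits=8):
    """Quantize and flatten a vertex array into a 1-D token sequence.

    vertices: (V, 3) float array, coordinates assumed in [-0.5, 0.5].
    Returns the sorted vertices' (z, y, x) coordinates concatenated
    into one sequence, followed by a stopping token.
    (Illustrative sketch; details differ from the paper's code.)
    """
    n_levels = 2 ** n_bits
    # Quantize each coordinate to one of n_levels discrete bins.
    quantized = np.clip(
        ((vertices + 0.5) * n_levels).astype(np.int64), 0, n_levels - 1)
    # Reorder columns to (z, y, x) and sort lexicographically (z primary)
    # so the sequence order is deterministic.
    zyx = quantized[:, [2, 1, 0]]
    order = np.lexsort((zyx[:, 2], zyx[:, 1], zyx[:, 0]))
    flat = zyx[order].reshape(-1)
    stop_token = n_levels  # one id beyond the coordinate vocabulary
    return np.concatenate([flat, [stop_token]])

verts = np.array([[0.25, -0.1, 0.0],
                  [-0.3, 0.2, 0.4]])
seq = flatten_vertices(verts)
print(seq.shape)  # (7,) -> 2 vertices * 3 coords + 1 stop token
```

The deterministic sort is what lets an autoregressive model treat an inherently unordered vertex set as a sequence.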
The paper emphasizes the advantages of representing meshes with polygons of variable sizes, known as n-gons, over traditional triangle meshes. This reduces redundancy by collapsing flat surfaces into single polygons, while acknowledging that non-planar n-gons require careful triangulation for rendering.
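The triangulation step mentioned above can be illustrated with the simplest scheme, fan triangulation, which splits an n-gon into n - 2 triangles sharing its first vertex. This is a hypothetical helper for illustration; it is adequate for convex, roughly planar faces, whereas non-planar n-gons may need the more careful handling the paper alludes to:

```python
def triangulate_ngon(face):
    """Fan-triangulate a polygon face given as a list of vertex indices.

    Minimal sketch: an n-gon becomes n - 2 triangles that all share the
    first vertex. Assumes a convex, roughly planar polygon; non-planar
    n-gons may require more careful triangulation.
    """
    v0 = face[0]
    return [(v0, face[i], face[i + 1]) for i in range(1, len(face) - 1)]

quad = [4, 7, 9, 2]            # a single quad face (vertex indices)
print(triangulate_ngon(quad))  # [(4, 7, 9), (4, 9, 2)]
```

A quad thus costs one face record as an n-gon but two as triangles, which is exactly the redundancy the n-gon representation avoids.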
Numerical Results
The authors evaluate the model using log-likelihood and predictive accuracy metrics. The experiments show that PolyGen achieves significant improvement over baseline methods, including uniform and Draco compression standards, in terms of bits per vertex and prediction accuracy. For example, the vertex model attains 2.46 bits per vertex at 85.1% accuracy, and the face model reaches 1.82 bits per vertex at 90% accuracy.
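The bits-per-vertex metric is simply the model's negative log-likelihood expressed in bits and averaged over vertices, which makes the comparison to a uniform baseline easy to reason about. A short sketch (the 8-bit quantization and the vertex count are illustrative assumptions):

```python
import math

def bits_per_vertex(total_nll_nats, num_vertices):
    """Convert a summed negative log-likelihood (in nats) into
    bits per vertex, the metric used to compare models."""
    return total_nll_nats / (num_vertices * math.log(2))

# A uniform distribution over 8-bit quantized coordinates assigns each
# of the 3 coordinates log2(256) = 8 bits, i.e. 24 bits per vertex.
uniform_bits = 3 * math.log2(256)
print(uniform_bits)  # 24.0

# Hypothetical numbers: a model with this total NLL over 1000 vertices
# scores 2.46 bits/vertex, the figure reported for the vertex model.
nll_nats = 2.46 * 1000 * math.log(2)
print(round(bits_per_vertex(nll_nats, 1000), 2))  # 2.46
```

Against the 24 bits/vertex uniform baseline, 2.46 bits/vertex corresponds to roughly a tenfold reduction in code length.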
Conditional mesh generation was tested using different context inputs such as object classes, images, and voxels. Even without directly optimizing for mesh reconstruction, PolyGen remained competitive with methods such as AtlasNet while producing noticeably more diverse samples.
Implications and Future Directions
Practically, PolyGen offers enhancements in the automated creation of 3D models used in virtual simulations and robotics. The ability to condition mesh generation on numerous input types broadens its usability in various domains, such as content creation in virtual environments and real-time 3D vision tasks in robotics.
Theoretically, this research contributes to advancements in sequence modeling for discrete geometry, aligning with the broader trend of extending natural language processing innovations like Transformers to complex structured data. Future work might refine the model by exploring higher bit-depths for mesh representation and by improving computational efficiency, a direction the authors begin to explore with alternative vertex model architectures.
PolyGen sets a precedent for new 3D shape synthesis methodologies, potentially inviting further exploration into hybrid models incorporating both graph neural networks and advanced autoregressive techniques to enhance structural coherence and computational efficiency. This foundational research underlines the promise of integrating direct mesh modeling into the neural generative modeling landscape.