Fusion 360 Gym: Programmatic CAD Environment
- Fusion 360 Gym is an interactive research environment that recasts 3D CAD construction as a finite, deterministic Markov decision process using B-Rep face-adjacency graphs.
- It integrates a rich dataset of 8,625 human-designed Fusion 360 parts with a Pythonic API, supporting rigorous benchmark studies and reproducible CAD program induction.
- Empirical results demonstrate that graph-based policy models outperform MLPs and random agents, highlighting the value of structured geometric learning in CAD.
Fusion 360 Gym is an interactive research environment designed for programmatic computer-aided design (CAD) by recasting the stepwise construction of 3D geometry as a Markov decision process (MDP). Centered on a minimal subset of common CAD operations, the platform enables rigorous study of CAD program induction, policy learning, and search within the sequential assembly of parametric shapes. It is closely tied to the Fusion 360 Gallery, a dataset of 8,625 human-designed parts expressed in a programmatic form and annotated with geometric, construction, and file-level metadata (Willis et al., 2020).
1. Formalization as a Markov Decision Process
Fusion 360 Gym models CAD reconstruction as a finite, deterministic MDP. The state space comprises pairs $(G_c, G_t)$, where $G_c$ is the current geometry under construction and $G_t$ is the fixed target geometry. Both are encoded as B-Rep (boundary representation) face-adjacency graphs $G = (V, E)$, where vertices represent trimmed parametric faces, edges connect adjacent faces, and node features summarize local face geometry via a 10×10 grid of sampled 3D points, their normals, trim masks, and one-hot surface-type tags (e.g., plane, cylinder, sphere).
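To make the node-feature layout concrete, the following is a minimal sketch of how one face's 10×10 samples could be flattened and stacked into a graph. The helper names (`face_feature`, `build_face_graph`, `blank_face`), the three-entry surface-type list, and the channel ordering are assumptions made for illustration; they are not taken from the Fusion 360 Gym code.

```python
import numpy as np

# Illustrative subset of surface tags; the full tag set is an assumption here.
SURFACE_TYPES = ("plane", "cylinder", "sphere")
GRID = 10  # samples per side on each face


def face_feature(points, normals, trim_mask, surface_type):
    """Flatten one face's sampled geometry into a single feature vector."""
    assert points.shape == (GRID, GRID, 3)
    assert normals.shape == (GRID, GRID, 3)
    assert trim_mask.shape == (GRID, GRID)
    one_hot = np.zeros(len(SURFACE_TYPES))
    one_hot[SURFACE_TYPES.index(surface_type)] = 1.0
    return np.concatenate(
        [points.ravel(), normals.ravel(), trim_mask.ravel().astype(float), one_hot]
    )


def build_face_graph(faces, adjacent_pairs):
    """Return (node feature matrix, symmetric boolean adjacency matrix)."""
    x = np.stack([face_feature(**f) for f in faces])
    adj = np.zeros((len(faces), len(faces)), dtype=bool)
    for i, j in adjacent_pairs:
        adj[i, j] = adj[j, i] = True
    return x, adj


def blank_face(surface_type):
    """Placeholder face with zeroed samples and a fully untrimmed mask."""
    return dict(
        points=np.zeros((GRID, GRID, 3)),
        normals=np.zeros((GRID, GRID, 3)),
        trim_mask=np.ones((GRID, GRID), dtype=bool),
        surface_type=surface_type,
    )


x, adj = build_face_graph([blank_face("plane"), blank_face("cylinder")], [(0, 1)])
print(x.shape, adj.sum())
```

With this layout each node carries 300 point coordinates, 300 normal components, 100 mask values, and one entry per surface type.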
The primary action abstraction is face extrusion, where each action $a = (f_s, f_e, o)$ selects a pair of parallel planar faces $(f_s, f_e)$ in the target geometry $G_t$ and a Boolean operation $o$. Upon applying $a$, the corresponding extrusion modifies the current geometry $G_c$ through the Fusion 360 API, and the updated state is $(G_c', G_t)$.
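The action space described above can be sketched as an enumeration over ordered pairs of parallel planar faces combined with each Boolean operation. This is an illustrative sketch only: `ExtrudeAction`, `candidate_actions`, the tolerance, and the four operation names are assumptions of this example rather than identifiers from the environment.

```python
from dataclasses import dataclass
from itertools import permutations

import numpy as np

# Assumed operation names for illustration.
OPERATIONS = ("new body", "join", "cut", "intersect")


@dataclass(frozen=True)
class ExtrudeAction:
    start_face: int
    end_face: int
    operation: str


def candidate_actions(normals, is_planar, tol=1e-6):
    """List one action per (ordered parallel planar face pair, operation)."""
    actions = []
    for i, j in permutations(range(len(normals)), 2):
        if not (is_planar[i] and is_planar[j]):
            continue
        # Unit normals are parallel or anti-parallel when their cross product vanishes.
        if np.linalg.norm(np.cross(normals[i], normals[j])) <= tol:
            actions.extend(ExtrudeAction(i, j, op) for op in OPERATIONS)
    return actions


# Face normals of an axis-aligned box: each face is parallel only to its opposite.
box_normals = np.array(
    [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float
)
box_actions = candidate_actions(box_normals, [True] * 6)
print(len(box_actions))
```

Restricting candidates to parallel planar pairs keeps the action set small relative to all face pairs, which is what makes the MDP finite and easy to search.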
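Reward functions are task-adaptive, with two canonical choices: an alignment reward given by the intersection-over-union $r = \mathrm{IoU}(G_c, G_t)$ between the current and target geometry, and an exact reconstruction reward awarding $r = 1$ only if $G_c = G_t$. The environment uses a discount factor of $\gamma = 1$ for finite-horizon processes.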
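The two reward choices can be illustrated on boolean voxel occupancy grids. This is a sketch under two assumptions: that the alignment reward is an intersection-over-union score, and that voxel grids are an acceptable stand-in for B-Rep solids (the environment itself evaluates geometry through Fusion 360, not through this code).

```python
import numpy as np


def iou(current, target):
    """Intersection-over-union of two boolean occupancy grids."""
    intersection = np.logical_and(current, target).sum()
    union = np.logical_or(current, target).sum()
    # Two empty grids are treated as identical.
    return float(intersection) / float(union) if union else 1.0


def alignment_reward(current, target):
    """Dense reward: how well the current shape overlaps the target."""
    return iou(current, target)


def exact_reward(current, target):
    """Sparse reward: 1 only when the two grids match exactly."""
    return 1.0 if np.array_equal(current, target) else 0.0


target = np.zeros((4, 4, 4), dtype=bool)
target[:2] = True           # target occupies the lower half
current = np.zeros_like(target)
current[:1] = True          # current occupies one quarter

print(alignment_reward(current, target), exact_reward(current, target))
```

The dense reward gives a learning signal at every step, while the sparse reward only distinguishes perfect reconstructions.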
2. Environment API and Protocol
Fusion 360 Gym interfaces with a headless Fusion 360 instance via a Pythonic protocol patterned on OpenAI Gym. The experimental workflow initializes the environment, sets the target geometry, and then issues reset/step commands.
Observation dictionaries include the current B-Rep face graph, bounding box, IoU with the target, and extruded face information. Rewards typically match the iou field or a binary indicator for exact reconstruction. The protocol supports rendering (screenshots), mesh/B-Rep export, and graph serialization.
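The reset/step interaction pattern can be sketched with a toy stand-in. `MockCadEnv` below is hypothetical and is not the Fusion 360 Gym client: its geometry is a set of unit-cell ids, its method names merely mimic the Gym convention, and its observation dictionary carries only an `iou` field and a cell count.

```python
import random


class MockCadEnv:
    """Toy stand-in for illustration; NOT the real Fusion 360 Gym client."""

    def __init__(self, max_steps=100):
        self.max_steps = max_steps
        self.target = frozenset()
        self.current = set()
        self.t = 0

    def set_target(self, target_cells):
        self.target = frozenset(target_cells)

    def reset(self):
        self.current, self.t = set(), 0
        return self._observe()

    def _observe(self):
        union = self.current | self.target
        iou = len(self.current & self.target) / len(union) if union else 1.0
        return {"iou": iou, "num_cells": len(self.current)}

    def step(self, action):
        cell, operation = action
        if operation == "join":
            self.current.add(cell)
        else:  # "cut"
            self.current.discard(cell)
        self.t += 1
        obs = self._observe()
        done = obs["iou"] == 1.0 or self.t >= self.max_steps
        return obs, obs["iou"], done, {}


env = MockCadEnv(max_steps=100)
env.set_target({0, 1, 2})
obs = env.reset()
rng = random.Random(0)
done = False
while not done:
    action = (rng.randrange(5), rng.choice(["join", "cut"]))
    obs, reward, done, info = env.step(action)
print(env.t, reward)
```

Here the reward simply mirrors the `iou` field of the observation, matching the first of the two reward conventions described above.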
3. Integration with the Fusion 360 Gallery Dataset
The environment builds on the Fusion 360 Gallery, which contains 8,625 human-designed parts. Each entry includes:
- A JSON construction sequence (restricted to sketches and extrudes)
- Final B-Rep files (.smt, .step)
- Triangulated mesh (.obj)
- Thumbnail images (.png)
Raw Fusion 360 files are parsed with the Python API, suppressing all non-sketch/extrude operations. Assemblies are decomposed into single-part designs and deduplication is performed via geometric metrics (body count, face count, volume, surface area). The set is split into 6,900 train and 1,725 test models. For Gym compatibility, test designs are loaded as .step files, normalized, and have their target face graphs extracted. Face-extrusion sequences are extracted from training data (~59% coverage), with remaining designs available for alternative benchmarks.
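Deduplication by geometric metrics can be sketched as grouping designs by a rounded signature. The `Design` record, the rounding precision, and the keep-first policy are assumptions of this example; the dataset's actual matching criteria and tolerances are not specified here.

```python
from collections import namedtuple

# Hypothetical record holding the four metrics named above.
Design = namedtuple("Design", "name body_count face_count volume surface_area")


def signature(design, decimals=4):
    """Geometric fingerprint; the rounding precision is an assumed tolerance."""
    return (
        design.body_count,
        design.face_count,
        round(design.volume, decimals),
        round(design.surface_area, decimals),
    )


def deduplicate(designs):
    """Keep the first design seen for each signature."""
    seen, unique = set(), []
    for design in designs:
        key = signature(design)
        if key not in seen:
            seen.add(key)
            unique.append(design)
    return unique


designs = [
    Design("bracket_a", 1, 12, 3.5, 20.25),
    Design("bracket_a_copy", 1, 12, 3.5, 20.25),
    Design("plate", 1, 6, 2.0, 14.0),
]
unique = deduplicate(designs)
print([d.name for d in unique])

# The stated split sizes account for every part in the dataset.
assert 6900 + 1725 == 8625
```

Rounding is a coarse way to compare floating-point metrics, since values near a rounding boundary can land in different buckets; a tolerance-based comparison would be more robust.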
4. Learning and Search Workflows
Imitation and reinforcement learning protocols are supported by standardized interaction patterns. For imitation, datasets are assembled by replaying human construction sequences via the environment, collecting $(G_c, G_t, a)$ triples of current geometry, target geometry, and demonstrated action.
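Policy models are parameterized as $\pi_\theta(a \mid G_c, G_t)$ using dual graph message-passing networks (MPNs) that embed the current geometry $G_c$ and the target geometry $G_t$: $h_c = \mathrm{MPN}(G_c)$ and $h_t = \mathrm{MPN}(G_t)$. Each action dimension is predicted with a softmax over the resulting graph features. Parameters are trained by minimizing the cross-entropy loss, which is equivalent to maximizing the log-likelihood of the demonstrated actions, typically with the Adam optimizer.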
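The training objective can be illustrated with a small NumPy sketch. It assumes the action factorizes into three independent softmaxes (start face, end face, operation) and takes the logits as given; the message-passing networks that would produce those logits are omitted, and the function names are hypothetical.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over a 1D array."""
    shifted = logits - logits.max()
    return shifted - np.log(np.exp(shifted).sum())


def action_log_prob(start_logits, end_logits, op_logits, action):
    """Log-probability of (start, end, op) under independent softmaxes."""
    start, end, op = action
    return (
        log_softmax(start_logits)[start]
        + log_softmax(end_logits)[end]
        + log_softmax(op_logits)[op]
    )


def cross_entropy_loss(batch):
    """Mean negative log-likelihood over (logits, demonstrated action) pairs."""
    return -np.mean([action_log_prob(*logits, action) for logits, action in batch])


# Uniform logits over 4 candidate faces and 4 operations.
uniform = (np.zeros(4), np.zeros(4), np.zeros(4))
loss = cross_entropy_loss([(uniform, (0, 1, 2)), (uniform, (3, 2, 0))])
print(loss)
```

Under uniform logits every action has probability 1/64, so the loss equals 3 ln 4; training would lower this value by shifting probability mass toward the demonstrated actions.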
Test-time rollout protocols include random sampling and beam search, with the latter exploring the space of legal actions exhaustively within a fixed step budget and beam width.
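A budgeted beam search can be sketched generically. In this illustration the budget counts state expansions, and `expand` and `score` are caller-supplied placeholders; in the CAD setting they would correspond to applying legal extrusions and scoring the result, but that mapping, like the toy problem below, is an assumption of this sketch.

```python
import heapq


def beam_search(initial, expand, score, beam_width, budget):
    """Keep the best `beam_width` states per depth; stop after `budget` expansions."""
    beam, best, used = [initial], initial, 0
    while beam and used < budget:
        children = []
        for state in beam:
            if used >= budget:
                break
            children.extend(expand(state))
            used += 1
        beam = heapq.nlargest(beam_width, children, key=score)
        if beam and score(beam[0]) > score(best):
            best = beam[0]
    return best


# Toy problem: build a sequence of increments in {0, 1, 2} whose sum reaches 5.
TARGET = 5


def expand(state):
    return [state + (step,) for step in (0, 1, 2)]


def score(state):
    return -abs(TARGET - sum(state))


best = beam_search((), expand, score, beam_width=2, budget=10)
print(best, score(best))
```

A random rollout would instead sample one child per step from the policy and restart when the horizon is reached, trading the beam's breadth for more independent attempts under the same budget.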
5. Empirical Performance and Comparative Analysis
Under the face extrusion + random rollout regime (100-step horizon), learned graph-based policies significantly outperform both MLPs and random agents. Summary results are:
| Model | IoU @ 100 steps | Exact reconstruction rate | Conciseness (avg. predicted/ground-truth sequence length ratio) |
|---|---|---|---|
| GAT (policy) | 0.9128 | 67.42% | 1.0206 |
| GCN | 0.9042 | 67.54% | 1.0168 |
| MLP | 0.8596 | 59.65% | — |
| Random agent | 0.8386 | 53.80% | — |
Ablation shows reliance on real human sequences: purely synthetic training yields substantial performance degradation (~20 IoU points). Augmented semi-synthetic data moderately narrows the gap. For search strategies (using GCN), random rollout exceeds beam and best-first search in exact match rates given step limits.
6. Architectural and Methodological Insights
Exploiting the B-Rep face-adjacency graph is critical. Graph neural network (GNN) architectures outpace flat MLPs, highlighting the value of structured geometric context in this domain. Real-world, human-designed construction sequences encode essential priors not captured by naive synthetic procedural data. Within tight action budgets, random rollouts of trained policies perform best, an observation that may reflect the locality and structured sparsity of valid CAD action spaces.
7. Research Significance and Applications
Fusion 360 Gym provides a reproducible, dataset-backed platform for the study of programmatic CAD construction from 3D geometry, combining a large-scale corpus of human-relevant models, well-defined sequential environments, and quantitative benchmarks. It serves as a foundation for research in program synthesis, neural-guided CAD, geometric reinforcement learning, and interpretable design automation (Willis et al., 2020). The modular design of its API and dataset further enables benchmarking of alternative modeling actions (such as sketch-extrusion) and integration with broader computational design methodologies.