- The paper presents a unified pipeline that fuses 3D Gaussian Splatting with physics-based fire simulation and MLLM-driven material reasoning.
- It offers fine-grained control over combustion dynamics while preserving scene realism, as validated by VBench and DINO structure scores and two AMT user studies.
- The framework is efficient (2.37 s/frame on an RTX 4090D with under 10 GB of GPU memory) and scalable, with promising applications in AR/VR, safety training, and visual effects.
FieryGS: Physics-Integrated Gaussian Splatting for In-the-Wild Fire Synthesis
Motivation and Background
Accurate and controllable synthesis of physically plausible fire dynamics in complex, real-world 3D scenes is a central challenge in simulation and visual effects. Traditional CFD and graphics pipelines, while physically grounded, are constrained by manual specification, asset modeling, and extensive parameter tuning, ultimately yielding significant sim-to-real gaps and limited scalability. Data-driven approaches applied directly to video or neural scene representations offer automation, yet lack material abstraction and physical control critical to realistic combustion. Recent advances in 3D Gaussian Splatting (3DGS) have pushed the boundaries of photorealistic, real-world-aligned reconstruction, but remain disconnected from physical phenomena such as fire, which demand intricate material reasoning, volumetric simulation, and physically consistent rendering.
Methodology
FieryGS presents a unified framework, tightly coupling multimodal large-language-model (MLLM) material inference, volumetric physics-based fire simulation, and a novel renderer seamlessly integrating combustion phenomena with 3D scene representations.
Material Reasoning via Multimodal LLMs
FieryGS begins by reconstructing high-fidelity 3DGS models from multi-view images using PGSR, yielding precise surface geometry and appearance. The 3D Gaussians are then grouped and segmented with SAM and SAGA to extract coherent material regions. Each region is rasterized onto 2D views and passed to GPT-4o for zero-shot inference of combustion-relevant physical properties, including material type, burnability, thermal diffusivity, and expected smoke color. Views are selected to maximize each region's visibility, which improves the accuracy and robustness of the MLLM's answers. The results are projected back onto the 3DGS, producing a physically annotated scene representation and defining a combustion-aware occupancy grid for simulation.
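The view-selection step can be illustrated with a minimal sketch. It assumes visibility is measured as the number of pixels a region covers in each candidate view; the summary does not state the actual criterion, and the function name and mask format are hypothetical.

```python
import numpy as np

def select_best_view(region_masks):
    """Return the index of the view in which the region covers the most pixels.

    region_masks: list of 2D boolean arrays, one per candidate view, where
    True marks pixels showing the region (hypothetical input format).
    """
    visible_pixels = [int(np.count_nonzero(mask)) for mask in region_masks]
    return int(np.argmax(visible_pixels))

# Three toy 4x4 views of the same region with different coverage.
masks = [np.zeros((4, 4), dtype=bool) for _ in range(3)]
masks[0][0, 0] = True       # 1 visible pixel
masks[1][:2, :2] = True     # 4 visible pixels
masks[2][:1, :3] = True     # 3 visible pixels
best_view = select_best_view(masks)
```

A pixel count is the simplest proxy; a real pipeline might also weigh occlusion or viewing angle before sending the chosen view to the MLLM.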
Figure 1: Hierarchical visual and textual prompts employed in GPT-4o enable accurate segmentation and combustion property inference across highly complex scene regions.
Volumetric Combustion Simulation
Combustion simulation occurs in two domains: air regions, where flame and smoke evolution are resolved, and solid regions, where heat transfer and charring are simulated. Incompressible flow models for flame dynamics balance visually essential features (buoyancy-driven turbulence, propagation) with computational efficiency, avoiding costly compressible formulations. Charring computes surface carbonization via heat diffusion and temperature thresholds; mass loss and structural degradation are omitted to retain tractability. All parameters, from ignition points, fire intensity (α), and wind direction to thermal diffusivity (β) and charring rate (ε_c), are directly user-controllable, supporting granular manipulation of fire behavior.
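As a rough illustration of the solid-region step, the sketch below applies one explicit finite-difference update of heat diffusion on a 2D grid and then advances a char level wherever the temperature exceeds a threshold. The discretization, the 2D grid, the threshold name `char_temp`, and all numeric values are assumptions made for illustration, not the paper's formulation; only the roles of β (thermal diffusivity) and ε_c (charring rate) come from the text.

```python
import numpy as np

def diffuse_heat(temp, solid, beta, dt=1.0, dx=1.0):
    """One explicit Euler step of dT/dt = beta * laplacian(T) in solid cells.

    Stability of this explicit 2D scheme requires dt * beta / dx**2 <= 0.25.
    """
    padded = np.pad(temp, 1, mode="edge")  # replicate edges: zero-flux boundary
    laplacian = (
        padded[:-2, 1:-1] + padded[2:, 1:-1]
        + padded[1:-1, :-2] + padded[1:-1, 2:]
        - 4.0 * temp
    ) / dx**2
    return np.where(solid, temp + dt * beta * laplacian, temp)

def update_char(char, temp, solid, char_temp, eps_c, dt=1.0):
    """Raise the char level at rate eps_c in solid cells hotter than char_temp."""
    burning = solid & (temp > char_temp)
    return np.clip(char + dt * eps_c * burning, 0.0, 1.0)

# Toy setup: a 5x5 solid slab at 300 K with one hot cell in the middle.
temp = np.full((5, 5), 300.0)
temp[2, 2] = 900.0
solid = np.ones((5, 5), dtype=bool)
char = np.zeros((5, 5))

temp = diffuse_heat(temp, solid, beta=0.1)
char = update_char(char, temp, solid, char_temp=500.0, eps_c=0.05)
```

In this toy step the hot cell cools as heat spreads to its neighbors, and only cells still above the threshold accumulate char; the resulting char level is the kind of quantity a renderer could use to darken the affected surface.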
Unified Rendering Pipeline
FieryGS integrates simulated fire, smoke, and 3DGS in a volumetric renderer. Fire self-emission is modeled via Planck's law, chromatic adaptation, and spectral-to-RGB conversion. Smoke rendering follows material-specific color inference. Charring is implemented by progressive dimming and color scaling of affected 3DGS regions. A Phong illumination pass accounts for fire-induced lighting, employing normal maps extracted from the 3DGS. To further enhance realism, an optional diffusion-based generative video refinement corrects artifacts and augments photometric effects.
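For reference, Planck's law gives the spectral radiance of a blackbody at temperature T. The sketch below evaluates only that formula; the chromatic adaptation and spectral-to-RGB stages mentioned above are omitted, and the function name and sample temperatures are illustrative assumptions rather than details from the paper.

```python
import numpy as np

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light in vacuum, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K

def planck_radiance(wavelength_m, temperature_k):
    """Blackbody spectral radiance B(lambda, T) in W sr^-1 m^-3.

    B = (2 h c^2 / lambda^5) / (exp(h c / (lambda k_B T)) - 1)
    """
    wavelength_m = np.asarray(wavelength_m, dtype=float)
    prefactor = 2.0 * H * C**2 / wavelength_m**5
    exponent = H * C / (wavelength_m * K_B * temperature_k)
    return prefactor / np.expm1(exponent)

# Radiance at red (650 nm) and blue (450 nm) wavelengths for a 1500 K emitter.
red = planck_radiance(650e-9, 1500.0)
blue = planck_radiance(450e-9, 1500.0)
```

At flame-like temperatures the red end of the visible spectrum dominates the blue end, which is why cooler flames read as orange; a renderer would integrate this spectrum against color-matching functions to obtain RGB values.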
Experimental Results
FieryGS demonstrates superior performance across six representative indoor and outdoor scenes, including challenging real-world scenarios with complex objects and geometries.
Quantitative Analysis: FieryGS yields the highest Aesthetic Quality (0.624) and Imaging Quality (0.702) scores on VBench and the lowest DINO Structure Score (0.38), indicating faithful scene preservation and visual fidelity relative to baselines including AutoVFX, Runway-V2V, and Instruct-GS2GS.
User Studies: Two comprehensive AMT studies (86 and 88 participants) on image/video outputs demonstrate consistent preference for FieryGS in both perceptual realism and physical plausibility. Notably, preference rates against AutoVFX are 88.9%/77.8% (realism) and 86.6%/85.5% (physicality).
Figure 2: AMT user study interface utilized to assess realism and physical plausibility, showing clear preference for FieryGS outputs across diverse scenes and modalities.
Qualitative Comparison: Baseline models either fail to retain scene structure (Runway-V2V), produce implausible static edits (Instruct-GS2GS), or lack combustion dynamics and smoke materialization (AutoVFX). FieryGS maintains temporal and spatial consistency, replicates flame propagation and ignition behaviors, and adapts smoke color to inferred material properties.
Runtime: FieryGS achieves efficient frame-wise simulation (2.37 s/frame on an RTX 4090D) with less than 10 GB of GPU memory usage, orders of magnitude faster than existing physically based VFX pipelines.
Controllability and Physical Fidelity
Granular parameter control enables precise authoring of fire dynamics, including ignition locations, fire intensity (via α, k), airflow/wind, and charring rates; this adaptability outpaces baseline systems, which lack explicit or fine-grained control. Simulation captures flame spread across adjacent combustibles via thermal diffusion and models progressive surface carbonization.
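One way to picture this control surface is as a single configuration object. The sketch below is purely hypothetical: the class name, field names, units, and default values are invented for illustration and do not reflect the FieryGS interface. The parameter k is left out because this summary does not define it.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class FireControls:
    """Hypothetical bundle of the user-controllable parameters named above."""
    ignition_points: List[Vec3] = field(default_factory=list)
    fire_intensity_alpha: float = 1.0        # alpha (placeholder default)
    wind_direction: Vec3 = (0.0, 0.0, 0.0)   # zero vector means no wind
    thermal_diffusivity_beta: float = 0.1    # beta (placeholder default)
    charring_rate_eps_c: float = 0.05        # epsilon_c (placeholder default)

    def __post_init__(self):
        # Reject negative values for the scalar controls.
        for name in ("fire_intensity_alpha",
                     "thermal_diffusivity_beta",
                     "charring_rate_eps_c"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be non-negative")

# Example: one ignition point, a stronger fire, and wind along +x.
controls = FireControls(
    ignition_points=[(0.2, 0.0, 1.5)],
    fire_intensity_alpha=2.0,
    wind_direction=(1.0, 0.0, 0.0),
)
```

Grouping the parameters this way keeps each authored fire scenario reproducible and lets invalid settings fail early, before any simulation time is spent.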

Figure 3: Real-world combustion in a live-fire drill used to validate simulation alignment and to motivate the physical approach.
Practical and Theoretical Implications
FieryGS transcends labor-intensive workflows by producing automated, real-world-aligned, physically plausible fire synthesis via high-level language prompts and reconstructed 3D geometry. The integration of MLLM-based material reasoning and physics simulation within the 3DGS framework introduces possibilities for scalable AR/VR content generation, virtual safety training, robotics perception under hazardous conditions, and heritage preservation, while minimizing risks inherent to real-world fire testing.
On the theoretical front, FieryGS exemplifies the maturation of neural scene representations beyond static rendering, incorporating semantic, physical, and dynamic attributes inferred from multimodal data via LLMs. The framework suggests future research directions including robust mass loss modeling, advanced combustion dynamics, and scaling to large conflagrations.
Limitations
Several simplifications are incorporated for efficiency, notably omitting mass loss, structural collapse, and detailed flame-object interactions. Non-uniform 3DGS distributions may induce simulation artifacts, and occasional MLLM misclassifications can lead to locally incorrect combustion behavior. Extension to large-scale phenomena such as forest fires requires methodological redesign.
Conclusion
FieryGS establishes a robust, scalable, and physically consistent pipeline for in-the-wild fire synthesis, tightly coupling high-fidelity 3DGS reconstruction, multimodal LLM material reasoning, efficient volumetric combustion simulation, and unified rendering. Experimental results confirm its advantages in realism, structure preservation, and controllability over prior baselines, supporting wide-ranging applications in simulation, entertainment, and safety. Future developments will address enhanced dynamics, scaling, and robustness in segmentation and material inference for broader scene generalization.