
Dynamic Neural Radiance Fields

Updated 28 February 2026
  • Dynamic Neural Radiance Fields are an extension of static NeRFs that model time-varying, non-rigid 3D scenes using spatiotemporal functions.
  • They employ deformation networks to warp points into a canonical space, enabling photorealistic rendering via integrated view-dependent color and density predictions.
  • Advanced techniques like tensor factorization, grid encodings, and regularization methods accelerate training while effectively handling occlusion and topological changes.

Dynamic Neural Radiance Fields (Dynamic NeRFs) generalize the static NeRF framework to model time-varying scenes, enabling photorealistic novel-view and novel-time synthesis of dynamic, non-rigid, and long-duration 3D environments. They represent a function mapping spatiotemporal coordinates and viewing direction to view-dependent emitted color and volume density, supporting advanced tasks such as dynamic reconstruction, free-viewpoint video, and editable 3D content.

1. Mathematical Formulation and Canonicalization

Dynamic NeRFs extend the static NeRF formulation $F: \mathbb{R}^3 \times S^2 \rightarrow \mathbb{R}^3 \times \mathbb{R}_+$ to a spatiotemporal function $F: \mathbb{R}^3 \times S^2 \times [0,1] \rightarrow \mathbb{R}^3 \times \mathbb{R}_+$. Given a 3D point $x$, view direction $d$, and continuous-valued time $t$, the network predicts color $c$ and density $\sigma$ for volume rendering.

Many dynamic NeRFs, starting with D-NeRF (Pumarola et al., 2020), decompose this mapping into two stages:

  • Deformation Network $\Psi_t$ predicts a residual displacement $\Delta x = \Psi_t(x, t)$, which warps the query point back to a shared "canonical" space: $x' = x + \Delta x$.
  • Canonical NeRF $g_\phi$ predicts density and radiance in canonical space: $(\sigma, c) = g_\phi(x', d)$.

The volumetric color is rendered by sampling points along each camera ray at time $t$, warping via $\Psi_t$, querying $g_\phi$, then integrating with the standard transmittance-weighted quadrature.
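
The two-stage pipeline above can be sketched in plain Python for a single ray. This is a minimal illustration under stated assumptions, not the D-NeRF implementation: `deform` and `canonical_field` are hypothetical analytic stand-ins for the learned networks $\Psi_t$ and $g_\phi$, and the near/far bounds and sample count are arbitrary.

```python
import math

def deform(x, t):
    """Hypothetical stand-in for Psi_t: a displacement that vanishes at t = 0."""
    return (0.1 * t * math.sin(x[1]), 0.0, 0.0)

def canonical_field(x, d):
    """Hypothetical stand-in for g_phi: a soft sphere centered at the origin."""
    r2 = sum(v * v for v in x)
    sigma = 10.0 * math.exp(-r2 / 0.25)          # density, always >= 0
    color = tuple(0.5 * (1.0 + di) for di in d)  # toy view-dependent color in [0, 1]
    return sigma, color

def render_ray(origin, direction, t, near=0.0, far=4.0, n_samples=64):
    """Transmittance-weighted quadrature along one ray at time t."""
    delta = (far - near) / n_samples
    transmittance = 1.0
    pixel = [0.0, 0.0, 0.0]
    for i in range(n_samples):
        s = near + (i + 0.5) * delta                       # midpoint of the i-th bin
        x = tuple(o + s * d for o, d in zip(origin, direction))
        dx = deform(x, t)                                  # Delta x = Psi_t(x, t)
        x_canon = tuple(a + b for a, b in zip(x, dx))      # x' = x + Delta x
        sigma, color = canonical_field(x_canon, direction)
        alpha = 1.0 - math.exp(-sigma * delta)             # opacity of this segment
        weight = transmittance * alpha
        for k in range(3):
            pixel[k] += weight * color[k]
        transmittance *= 1.0 - alpha
    return tuple(pixel), 1.0 - transmittance               # color, accumulated opacity

rgb, opacity = render_ray((0.0, 0.0, -2.0), (0.0, 0.0, 1.0), t=0.5)
```

Because the toy displacement is zero at $t = 0$, the observation space coincides with the canonical space at that time, which mirrors the kind of canonical-time constraint some methods impose.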

This canonicalization approach is robust to non-rigid deformation and supports time- and view-continuous rendering. The architectural motif is preserved in subsequent advances (Jang et al., 2022, Guo et al., 2022).

2. Representation Strategies and Acceleration

Dynamic NeRF research has diverged into several representational classes:

  • MLP-Based Deformations: D-NeRF and H-NeRF (Xu et al., 2021) parameterize both deformation and radiance fields purely with multi-layer perceptrons and often enforce constraints such as $\Delta x = 0$ at canonical time.
  • Voxel and Grid-Based Encodings: To accelerate training/inference, methods such as Neural Deformable Voxel Grid (NDVG) (Guo et al., 2022) and D-TensoRF (Jang et al., 2022) replace most of the MLPs with explicit 3D or 4D tensor grids, utilizing trilinear/quadrilinear interpolation and lightweight decoders. D-TensoRF introduces tensor decomposition by Canonical Polyadic (CP) and Matrix–Matrix (MM) methods, representing the joint space (xyz, time) as a low-rank set of factors, offering 10–40x speedup and enabling real-time dynamic NeRFs.
  • Particle Encodings: Online and high-adaptation approaches, such as ParticleNeRF (Abou-Chakra et al., 2022), use a set of dynamic neural particles whose positions and features are optimized continuously via backpropagated photometric gradients interpreted as physics-like velocity updates.

Method | Canonicalization | Representation | Acceleration
D-NeRF | MLP deformation | fully MLP | -
D-TensoRF | none (tensor grid time) | 4D grid + CP/MM | tensor factorization
NDVG | MLP + grid deformation | 3D grids + small MLP | explicit trilinear
ParticleNeRF | none | dynamic particles | physics-based online

Efficient approaches such as InstantNGP-based hash grids and multi-resolution factorization are frequently employed for both speed and compactness (Quartey et al., 2022, Abou-Chakra et al., 2022).
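
To make the low-rank idea concrete, the following sketch stores a scalar 4D field over (x, y, z, t) as a sum of rank-one terms in the CP style and compares its parameter count with a dense grid. It is a toy under stated assumptions, not D-TensoRF's code: the names `make_cp_grid`, `cp_query`, and `param_counts` are hypothetical, the factors are random rather than learned, and a real model would store feature channels and decode them.

```python
import random

def lerp_lookup(vec, u):
    """Linearly interpolate a 1D factor at a continuous coordinate u in [0, 1]."""
    pos = u * (len(vec) - 1)
    i0 = min(int(pos), len(vec) - 2)
    w = pos - i0
    return (1.0 - w) * vec[i0] + w * vec[i0 + 1]

def make_cp_grid(rank, res_xyz, res_t, seed=0):
    """Random CP factors: `rank` vectors per axis (learned in a real model)."""
    rng = random.Random(seed)
    def rand_vecs(n):
        return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(rank)]
    return {"x": rand_vecs(res_xyz), "y": rand_vecs(res_xyz),
            "z": rand_vecs(res_xyz), "t": rand_vecs(res_t)}

def cp_query(grid, x, y, z, t):
    """Field value = sum over rank of the product of four 1D factor lookups."""
    return sum(
        lerp_lookup(vx, x) * lerp_lookup(vy, y) * lerp_lookup(vz, z) * lerp_lookup(vt, t)
        for vx, vy, vz, vt in zip(grid["x"], grid["y"], grid["z"], grid["t"])
    )

def param_counts(rank, res_xyz, res_t):
    """Parameters of a dense 4D grid versus the CP factorization."""
    dense = res_xyz ** 3 * res_t
    cp = rank * (3 * res_xyz + res_t)
    return dense, cp

grid = make_cp_grid(rank=8, res_xyz=64, res_t=32)
value = cp_query(grid, 0.3, 0.7, 0.5, 0.25)
dense, cp = param_counts(rank=8, res_xyz=64, res_t=32)
```

Interpolating each 1D factor separately and multiplying the results is equivalent to quadrilinear interpolation of the corresponding rank-one 4D tensor, which is why the dense grid never needs to be materialized.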

3. Handling Occlusion, Topology, and Scene Flow

Dynamic NeRFs incorporate specialized modules to account for occlusions, disocclusions, and topological changes:

  • Occlusion Modeling: NDVG (Guo et al., 2022) augments the deformation network with an occlusion weight $w^{\text{occ}} \in [0,1]$ per sample, modulating the contribution of each point after deformation to suppress ghosts and background leaking into foreground regions.
  • Temporal Regularization: D-TensoRF applies smoothing regularization to time factors and matrix slices to promote continuity across frames, essential for temporal coherence and mitigating motion artifacts (Jang et al., 2022).
  • Scene Flow and Motion Fields: Methods like VDNeRF (Zou et al., 2025) and H-NeRF (Xu et al., 2021) explicitly integrate forward/backward scene flow or leverage pre-fitted body models (imGHUM) to provide physically meaningful temporal correspondences, vital for disambiguating camera and object motion in dynamic, real-world urban or articulated scenes.
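
As an illustration of the temporal regularization item above, one simple smoothness penalty is the mean squared difference between adjacent entries of each time factor. The exact form used by D-TensoRF is not given here, so the squared first-difference penalty and the name `temporal_smoothness` are assumptions made for this sketch.

```python
def temporal_smoothness(time_factors):
    """Mean squared first difference along the time axis of each factor.

    `time_factors` is a list of 1D sequences, one per rank component,
    each indexed by frame. Returns 0.0 for factors that are constant in time.
    """
    total, count = 0.0, 0
    for vec in time_factors:
        for prev, nxt in zip(vec, vec[1:]):
            total += (nxt - prev) ** 2
            count += 1
    return total / count if count else 0.0

smooth = temporal_smoothness([[0.0, 0.1, 0.2, 0.3]])   # slowly varying factor
jumpy = temporal_smoothness([[0.0, 1.0, 0.0, 1.0]])    # flickering factor
```

Added to the photometric loss with a small weight, a term like this penalizes frame-to-frame flicker in the time factors while leaving slow motion nearly unpenalized.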

4. Training Objectives and Data Regimes

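Dynamic NeRFs are trained end-to-end with photometric reconstruction losses over batches of rays drawn from multi-view (or monocular) sequences. The photometric term is commonly paired with regularizers such as the temporal smoothing and spatial/temporal total-variation (TV) penalties described in Sections 3 and 5.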

Dynamic NeRFs operate on varying acquisition regimes, from sparse camera arrays (H-NeRF, VDNeRF) to monocular videos with inferred poses (D-NeRF), to on-the-fly continually streaming inputs (Yan et al., 2023, Abou-Chakra et al., 2022).
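
A minimal sketch of the photometric objective, assuming a mean-squared-error loss over a random batch of (frame, row, column) ray indices. The function names and the uniform sampling scheme are hypothetical choices for illustration; individual methods may weight or sample rays differently.

```python
import random

def sample_ray_batch(num_frames, height, width, batch_size, seed=0):
    """Draw random (frame, row, col) indices; each index identifies one training ray."""
    rng = random.Random(seed)
    return [
        (rng.randrange(num_frames), rng.randrange(height), rng.randrange(width))
        for _ in range(batch_size)
    ]

def photometric_loss(rendered, target):
    """Mean squared error between rendered and ground-truth RGB triples."""
    if not rendered:
        return 0.0
    squared = sum(
        (r - g) ** 2
        for pred, truth in zip(rendered, target)
        for r, g in zip(pred, truth)
    )
    return squared / (3 * len(rendered))

batch = sample_ray_batch(num_frames=100, height=400, width=400, batch_size=4)
loss = photometric_loss([(0.2, 0.4, 0.6)], [(0.2, 0.4, 0.8)])
```

In a training loop, each sampled index would be turned into a camera ray at that frame's timestamp, rendered, and compared with the observed pixel.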

5. Editing, Compression, and Streaming

Recent work expands dynamic NeRFs beyond reconstruction towards interactive applications:

  • Editing: SealD-NeRF (Huang et al., 2024) enables interactive, pixel-level editing of dynamic sequences, mapping single-frame user edits to temporally consistent canonical changes via a teacher-student scheme with the deformation network frozen, ensuring that edits propagate seamlessly along prescribed motion without distorting dynamics.
  • Compression: Methods such as D-TensoRF and VideoRF (Jang et al., 2022, Wang et al., 2023) serialize dynamic radiance fields into highly compressible grids or 2D video streams amenable to hardware codecs. Techniques include low-rank tensor decompositions (CP/MM), 3D-to-2D Morton packing, and spatial/temporal TV regularization, achieving model sizes down to 1–10 MB for hundreds of frames and enabling real-time mobile playback.
  • On-the-Fly/Online Adaptation: OD-NeRF (Yan et al., 2023) and ParticleNeRF (Abou-Chakra et al., 2022) support low-latency streaming and rapid reconstruction from sequential video, employing occupancy grid transitions, projected-color conditioning, and particle-based adaptation for framewise retraining at 6–200 ms per update.
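
The 3D-to-2D Morton packing mentioned under Compression relies on Morton (Z-order) codes, which interleave coordinate bits so that nearby voxels tend to receive nearby indices. The sketch below encodes a voxel with a 3D Morton code and then reads that index as a 2D Morton code to obtain a pixel location. This particular composition and the name `voxel_to_pixel` are assumptions for illustration, not necessarily the exact mapping used by VideoRF.

```python
def morton3_encode(x, y, z, bits=10):
    """Interleave the low `bits` bits of x, y, z into one Morton index."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (3 * b)
        code |= ((y >> b) & 1) << (3 * b + 1)
        code |= ((z >> b) & 1) << (3 * b + 2)
    return code

def morton2_decode(code, bits=15):
    """Split a Morton index into 2D coordinates (u, v) by de-interleaving bits."""
    u = v = 0
    for b in range(bits):
        u |= ((code >> (2 * b)) & 1) << b
        v |= ((code >> (2 * b + 1)) & 1) << b
    return u, v

def voxel_to_pixel(x, y, z, bits=10):
    """Map a voxel to a pixel: 3D Morton index, then 2D Morton decode."""
    code = morton3_encode(x, y, z, bits)
    return morton2_decode(code, bits=(3 * bits + 1) // 2)

# Pack a 4x4x4 grid (2 bits per axis) into an 8x8 image without collisions.
pixels = {voxel_to_pixel(x, y, z, bits=2)
          for x in range(4) for y in range(4) for z in range(4)}
```

Keeping spatial neighbors close together in the 2D layout is what makes the packed images friendlier to block-based video codecs.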

6. Specialized Extensions: HDR, Specularities, and Hybrid Rendering

  • HDR Dynamic NeRFs: HDR-HexPlane (Wu et al., 2024) extends HexPlane to accommodate dynamic scenes with variable exposure, learning a per-image exposure mapping and fixing a monotonic camera response function (CRF) for stable optimization. Volumetric HDR and LDR rendering is performed with a 4D grid factored into six planes, achieving high-quality, exposure-robust free-viewpoint renderings.
  • Specular/Dynamic Reflective Objects: NeRF-DS (Yan et al., 2023) addresses the domain gap for non-Lambertian, specular dynamic objects by conditioning the radiance branch on observation-space position and surface normal, combined with a mask-guided deformation field to handle correspondence in challenging reflective motion.
  • Hybrid Mesh-Volumetric Systems: Dynamic Mesh-Aware Radiance Fields (Qiao et al., 2023) proposes a two-way coupling of NeRF volumes and explicit meshes, developing a unified light-transport system by interleaving NeRF ray marching with mesh path tracing, HDR training, and GPU-accelerated physics, allowing physically consistent dynamic scene simulation and real-time hybrid rendering.
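
For the HDR item above, the relationship between rendered HDR radiance and an observed LDR pixel can be sketched as a per-image exposure scale followed by a fixed monotonic camera response function. The gamma curve, the clipping, and the example exposure values below are assumptions chosen for illustration; they are not taken from HDR-HexPlane, where the exposures would be learned during training.

```python
def crf(x, gamma=2.2):
    """Fixed monotonic camera response: clip to [0, 1], then gamma-encode."""
    clipped = min(max(x, 0.0), 1.0)
    return clipped ** (1.0 / gamma)

def hdr_to_ldr(hdr_rgb, exposure):
    """Scale HDR radiance by a per-image exposure, then apply the fixed CRF."""
    return tuple(crf(channel * exposure) for channel in hdr_rgb)

# Hypothetical per-image exposure scales (learnable parameters in practice).
exposures = {"img_000": 0.5, "img_001": 2.0}

hdr_pixel = (0.10, 0.20, 0.40)
ldr_dark = hdr_to_ldr(hdr_pixel, exposures["img_000"])
ldr_bright = hdr_to_ldr(hdr_pixel, exposures["img_001"])
```

Fixing the response curve removes the scale ambiguity between exposure and CRF, so the same HDR radiance can explain both the dark and the bright observation.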

7. Limitations, Quantitative Performance, and Future Directions

Dynamic NeRFs face challenges including:

  • Ambiguity from Sparse Data: Monocular or sparsely-viewed dynamic captures (especially with fast motion or occlusion) remain ill-posed without additional constraints or priors.
  • Computational Efficiency: While grid and tensor factorization methods (e.g., D-TensoRF, NDVG) offer significant speed and compactness improvements, high-fidelity online streaming with large spatiotemporal coverage remains demanding (Jang et al., 2022, Guo et al., 2022).
  • Topological Change and Long Duration: Methods such as NeVRF (Wu et al., 2023) and VideoRF (Wang et al., 2023) demonstrate scalability to long temporal sequences and complex topology, but with tradeoffs in reconstruction granularity and storage.

Empirically, state-of-the-art dynamic NeRFs achieve PSNR $\approx 29$–$33$ dB, SSIM $0.88$–$0.98$, and LPIPS $0.03$–$0.08$ on synthetic and real benchmarks, with training times reducing from tens of hours (D-NeRF) to minutes (D-TensoRF, NDVG, ParticleNeRF), and storage below 10 MB for 100–200 frame sequences in compressed representations (Jang et al., 2022, Abou-Chakra et al., 2022).
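
To give a sense of scale for the PSNR range quoted above, the short calculation below converts it to a per-pixel error, assuming pixel intensities are normalized to $[0, 1]$ (a common convention, assumed here rather than stated for these benchmarks):

$$\text{PSNR} = -10 \log_{10}(\text{MSE}) \quad\Longrightarrow\quad \text{MSE} = 10^{-\text{PSNR}/10}$$

$$\text{PSNR} = 29\ \text{dB}: \quad \text{MSE} = 10^{-2.9} \approx 1.26 \times 10^{-3}, \quad \sqrt{\text{MSE}} \approx 0.035$$

$$\text{PSNR} = 33\ \text{dB}: \quad \text{MSE} = 10^{-3.3} \approx 5.0 \times 10^{-4}, \quad \sqrt{\text{MSE}} \approx 0.022$$

Under that assumption, the reported range corresponds to a root-mean-square error of roughly 2–4% of the full intensity range.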

Anticipated future advances include deeper integration of physical priors (scene flow, SDFs, human models), real-time editable dynamic scenes, end-to-end joint compression and representation, and extension to unbounded environments and continual streaming video scenarios (Zheng et al., 2024, Zou et al., 2025, Yan et al., 2023).
