Multi-plane Light Imaging (MPLI)
- Multi-plane Light Imaging (MPLI) is a layered representation that encodes scene and lighting information as a stack of depth-aligned planes for versatile rendering and relighting.
- It spans both physical (diffractive optics) and computational (neural rendering) realizations, using techniques such as homography warping, alpha compositing, and adaptive plane interactions.
- Applications span AR, holographic imaging, microscopy, and video relighting, demonstrating efficient, real-time synthesis with low crosstalk and high imaging fidelity.
A Multi-plane Light Image (MPLI) generalizes classic multi-layer visual representations by encoding scene or lighting information as a stack of planes aligned along depth. MPLI variants have been implemented across diffractive optics, scene relighting, computational photography, and light field synthesis. The unifying principle is that each “plane” encodes a 2D spatial function (irradiance, color, density, mask, etc.) at its associated depth, with application-specific rendering, inversion, or compositing methods. The structural basis and modern mathematical formalism of MPLI enable precise multi-depth, multi-band, and multi-directional control at both the hardware and software levels.
1. Mathematical Foundations of Multi-plane Light Imaging
MPLI representations instantiate the scene or light field as $N$ planes at increasing depths $d_1 < d_2 < \cdots < d_N$, each carrying a spatial map $f_i(u, v)$. Depending on context, $f_i$ may refer to color (MPI), opacity/density, physical irradiance (as in relighting), or features. Rendering a view from an MPLI proceeds via homography-based warping and alpha compositing:

$$
\hat{C}(u, v) = \sum_{i=1}^{N} c_i\big(\mathcal{H}_i(u, v)\big)\,\alpha_i\big(\mathcal{H}_i(u, v)\big) \prod_{j=1}^{i-1} \Big(1 - \alpha_j\big(\mathcal{H}_j(u, v)\big)\Big),
$$

where $\mathcal{H}_i$ is the homography associated with plane $i$, and $c_i$ and $\alpha_i$ are the color and opacity functions for that plane; the product accumulates the transmittance of the nearer planes $j < i$. For physical light modeling, such as in RelightMaster (Bian et al., 9 Nov 2025), each plane encodes per-pixel irradiance:

$$
E_i(u, v) = \sum_{k} \frac{I_k \, \mathbf{c}_k}{\big\| \mathbf{x}_i(u, v) - \mathbf{p}_k \big\|^{2}},
$$

with $\mathbf{p}_k$ (3D position), $I_k$ (intensity), and $\mathbf{c}_k$ (color) for light $k$, and $\mathbf{x}_i(u, v)$ the 3D location on plane $i$.
Computationally, rendering or propagating light from an MPLI often requires layerwise transforms (e.g., Fresnel integral for optics, volume rendering for visual planes), incorporating wavelength, viewpoint, or lighting parameterization as needed.
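To make the compositing and irradiance formulas above concrete, here is a minimal NumPy sketch (illustrative names and shapes, not reference code from any cited paper); homography warping is assumed to have been applied already, so the planes are pre-aligned to the target view:

```python
import numpy as np

def composite_mpi(colors, alphas):
    """Alpha-composite pre-warped planes, indexed front (0) to back (N-1).
    colors: (N, H, W, 3) and alphas: (N, H, W, 1), both in [0, 1]."""
    out = np.zeros(colors.shape[1:])           # accumulated color, (H, W, 3)
    transmittance = np.ones(alphas.shape[1:])  # fraction of light not yet absorbed
    for c, a in zip(colors, alphas):           # front-to-back traversal
        out += transmittance * a * c           # c_i * alpha_i * prod_{j<i}(1 - alpha_j)
        transmittance *= 1.0 - a
    return out

def light_plane_irradiance(xyz_plane, light_pos, intensity, color):
    """Per-pixel irradiance on one plane from a single point light.
    xyz_plane: (H, W, 3) positions x_i(u, v); light_pos: (3,); color: (3,)."""
    d2 = np.sum((xyz_plane - light_pos) ** 2, axis=-1, keepdims=True)
    return intensity * color / np.maximum(d2, 1e-8)  # inverse-square falloff
```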
2. Physical and Algorithmic Realizations
Optics: Broadband Diffractive Optical Elements
Physically-encoded MPLI uses a multi-level phase profile patterned (e.g., via grayscale lithography) onto a substrate to produce a Broadband Diffractive Optical Element (BDOE) (Meem et al., 2019). The output field at each depth plane $z$ is computed via the Fresnel diffraction integral:

$$
U(x', y'; z, \lambda) = \frac{e^{ikz}}{i\lambda z} \iint T(x, y)\, \exp\!\left(\frac{ik}{2z}\Big[(x' - x)^2 + (y' - y)^2\Big]\right) dx\, dy
$$

Here, $T(x, y) = \exp\!\big(i\,\frac{2\pi}{\lambda}\,(n - 1)\,h(x, y)\big)$ encodes the BDOE's complex field modulation through its height profile $h(x, y)$ (thin-element approximation), and $k = 2\pi/\lambda$. The topographic profile $h(x, y)$ is optimized (using direct binary search or gradient methods) to generate prescribed images at multiple depths $z$ and/or wavelengths $\lambda$; a propagation sketch follows the parameter list below. Key fabrication parameters include:
- Pixel pitch: 10–20 μm
- Height levels: up to 100 (max height ∼2.6 μm)
- Transmission/reflection: >96% (VIS), ∼85% (NIR)
- Crosstalk (SSIM): <0.1 (spectral), 0.17–0.55 (multi-plane)
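As a complement to the formula above, the sketch below is a minimal thin-element simulation (NumPy, FFT-based Fresnel transfer-function method; function names and the unit-amplitude illumination are assumptions, not the fabrication code of Meem et al.) of the intensity a height-profile BDOE produces at a chosen depth:

```python
import numpy as np

def fresnel_propagate(field, wavelength, z, pitch):
    """Propagate a complex field by distance z (meters) with the paraxial
    Fresnel transfer-function method; pitch is the pixel pitch in meters."""
    ny, nx = field.shape
    k = 2 * np.pi / wavelength
    fx = np.fft.fftfreq(nx, d=pitch)
    fy = np.fft.fftfreq(ny, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    H = np.exp(1j * k * z) * np.exp(-1j * np.pi * wavelength * z * (FX**2 + FY**2))
    return np.fft.ifft2(np.fft.fft2(field) * H)

def bdoe_intensity(height, wavelength, n_index, z, pitch):
    """Intensity at depth z behind a BDOE with height map `height` (meters)
    and material index n_index, under unit-amplitude plane-wave illumination."""
    phase = 2 * np.pi / wavelength * (n_index - 1) * height  # thin-element phase
    return np.abs(fresnel_propagate(np.exp(1j * phase), wavelength, z, pitch)) ** 2
```

For a multi-plane, multi-band target, the same height map is propagated to every prescribed (z, λ) pair, and the aggregate similarity to the target images forms the merit function optimized in Section 4.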
Computational: Neural Volume and Image Synthesis
Software-based MPLI (a superset including classical MPI) leverages deep convolutional or transformer architectures to build and refine rich layered scene representations:
- Network predicts per-plane features: color, density, irradiance, or attention masks
- Adaptive depth positioning and inter-plane interactions (e.g., via self-attention and masking (Han et al., 2022))
- Differentiable volume rendering or compositing for novel view/relighting synthesis
- Support for real and synthetic data, arbitrary viewpoint, and even time/lighting variation (Temporal-MPI (Xing et al., 2021), RelightMaster (Bian et al., 9 Nov 2025))
In some formulations, the MPLI encodes lighting cues as a stack of 2D images with precise registration to the content stream, permitting plug-and-play “visual prompts” for advanced diffusion models.
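As a concrete instance of the per-plane prediction step listed above, here is a minimal PyTorch sketch (layer sizes and names are illustrative, not drawn from any cited architecture) of a head that converts a shared feature map into the color/opacity stack consumed by the compositing rule of Section 1:

```python
import torch
import torch.nn as nn

class MPIHead(nn.Module):
    """Map a (B, C, H, W) feature volume to per-plane color and opacity,
    the generic output format of learned MPLI/MPI pipelines."""
    def __init__(self, feat_ch=64, n_planes=32):
        super().__init__()
        self.n_planes = n_planes
        self.to_rgba = nn.Conv2d(feat_ch, n_planes * 4, kernel_size=3, padding=1)

    def forward(self, feats):
        rgba = self.to_rgba(feats)                   # (B, 4N, H, W)
        b, _, h, w = rgba.shape
        rgba = rgba.view(b, self.n_planes, 4, h, w)  # split into N planes
        color = torch.sigmoid(rgba[:, :, :3])        # per-plane RGB in [0, 1]
        alpha = torch.sigmoid(rgba[:, :, 3:])        # per-plane opacity in [0, 1]
        return color, alpha
```

In adaptive variants, a second branch predicts scene-specific plane depths or attention masks that modulate inter-plane interactions (Han et al., 2022).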
3. Representative Implementations and Applications
| System / Context | Plane Content | Use Case |
|---|---|---|
| BDOE (Meem et al., 2019) | Phase relief (height) | Volumetric/spectral projection |
| RelightMaster (Bian et al., 9 Nov 2025) | Plane irradiance | Video relighting |
| Temporal-MPI (Xing et al., 2021) | Basis MPIs, per-frame | Dynamic scene synthesis |
| Cross-MPI (Zhou et al., 2020) | Plane-aware attention | Super-resolution stereo |
| MMPI (He et al., 2023) | Multiple MPIs + blending | 360°, robust NeRF |
| Learned optical multiplexing (Cheng et al., 2019) | Multiplexed intensity | Multi-focal plane microscopy |
Applications of MPLI span:
- Volumetric AR and depth-layered displays (physical, computational)
- Efficient, crosstalk-minimized holographic and security imaging (multi-band/multi-plane BDOEs)
- Video relighting with explicit spatio-temporal light prompts (RelightMaster, DiT adaptation)
- High-speed, high-fidelity multi-focal plane microscopy (deep learned LED multiplexing)
- Unbounded scene and long-trajectory radiance field compression (MMPI)
- Realistic optical flow, segmentation, and cross-modal synthesis tasks (MPI-Flow, Cross-MPI)
4. Design, Optimization, and Training Strategies
Physical MPLI design involves:
- Jointly optimizing phase profiles or multiplexing weights to match target patterns at multiple depths and/or wavelengths (DBS, iterative gradient methods; see the DBS sketch after this list)
- Manufacturing constraints: avoidance of subwavelength features, discrete height quantization, index-matched coatings for flat optics, and tolerance analysis for fabrication errors (height-error standard deviations ≲65 nm are readily tolerated, with <20% efficiency loss up to σ≈100 nm (Meem et al., 2019))
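Below is a minimal direct-binary-search (DBS) sketch, the greedy per-pixel search named in the first item above (the interface is hypothetical; `merit` would typically evaluate the Fresnel model of Section 2 against all target planes):

```python
import numpy as np

def direct_binary_search(height, levels, merit, max_sweeps=10, seed=0):
    """Greedily perturb one pixel of the height map at a time, keeping a
    trial height level only if the caller-supplied merit(height) improves."""
    rng = np.random.default_rng(seed)
    best = merit(height)
    for _ in range(max_sweeps):
        improved = False
        for idx in rng.permutation(height.size):   # visit pixels in random order
            i, j = divmod(idx, height.shape[1])
            old = height[i, j]
            height[i, j] = rng.choice(levels)      # try a random allowed level
            score = merit(height)
            if score > best:
                best, improved = score, True       # keep the improvement
            else:
                height[i, j] = old                 # revert the change
        if not improved:                           # a full sweep made no progress
            break
    return height, best
```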
Computational MPLI construction employs:
- Architectural innovations for plane- and layer-wise operations (3D CNNs, self-attention, residual masking)
- Adaptive or data-driven allocation of depth positions and plane content (scene-specific adjustment modules, learned attention masks)
- Plane- or basis-parameterized lighting or appearance bases for temporal, lighting, or view consistency (Temporal-MPI, MPIs with lighting coefficients)
- Data generation: synthetic “warp-back” pipelines for in-the-wild view synthesis, or closed-loop simulation/data for physical hardware (microscopy, optics)
Optimization targets include direct reconstruction loss, perceptual or frequency-consistency terms, crosstalk minimization, and efficiency metrics. In physical MPLI, figures of merit such as imaging efficiency, transmission, and structural similarity quantify optical performance (Meem et al., 2019). In learned MPLI, PSNR, SSIM, LPIPS, or custom rendering losses are used (Bian et al., 9 Nov 2025, Han et al., 2022); a sketch of such a composite objective follows.
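As an illustration of the learned-MPLI objectives just mentioned, a short PyTorch sketch combining a pixel term with an LPIPS perceptual term (weights are illustrative; `lpips` is the standard perceptual-similarity package and is assumed to be installed):

```python
import torch
import lpips  # pip install lpips

class MPLIRenderLoss(torch.nn.Module):
    """Pixel (L1) + perceptual (LPIPS) reconstruction objective for a
    rendered view against its ground-truth frame."""
    def __init__(self, w_pix=1.0, w_perc=0.1):
        super().__init__()
        self.w_pix, self.w_perc = w_pix, w_perc
        self.perc = lpips.LPIPS(net='vgg')  # frozen VGG-based LPIPS network

    def forward(self, pred, target):        # (B, 3, H, W) images in [0, 1]
        pix = torch.nn.functional.l1_loss(pred, target)
        perc = self.perc(pred * 2 - 1, target * 2 - 1).mean()  # LPIPS expects [-1, 1]
        return self.w_pix * pix + self.w_perc * perc
```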
5. Performance Characteristics and Limitations
Physical MPLI devices (multi-band BDOEs) achieve:
- Measured imaging efficiency: 54% (multi-plane, 400–700 nm, experimental), 60–70% (single-plane)
- Absolute throughput: >96% (visible), ~85% (NIR)
- Multi-plane crosstalk: SSIM <0.2 (sim), up to 0.55 (exp.; minimized by smaller pitches, larger plane separation)
- Scalable, mass-producible, and robust to fabrication error
Learned/algorithmic MPLI systems achieve:
- Real-time or tens-of-milliseconds-level synthesis with compact model sizes (Temporal-MPI: 0.008 s/frame, 0.48 GB vs. 5.4 GB for 3DMaskVol21 (Xing et al., 2021))
- State-of-the-art or superior visual quality on multiple benchmarks (e.g., PSNR 34.2, SSIM 0.966 on Ken Burns (Han et al., 2022))
- Near-seamless generalization to arbitrary lighting, time, or view distributions (RelightMaster multi-source, temporal control (Bian et al., 9 Nov 2025))
- Limitations: explicit view-dependent effects still require extensions (spherical-harmonic radiance, learned per-layer lighting), and thin or sloped geometries may need more layers or adaptive merging (Solovev et al., 2022)
6. Extensions, Generalizations, and Research Outlook
Recent research broadens MPLI beyond fixed-RGBA stacking:
- Spherical-harmonic or per-plane lighting profiles for fine relighting and photorealism
- Adaptive, attention-driven depth allocation for efficient plane utilization (Han et al., 2022)
- Multiple, directionally-oriented MPIs (MMPI) and adaptive per-voxel blending for robust 360° or long-range scenes (He et al., 2023)
- Physical extensions: multi-level, multi-band, and reflective elements, as well as programmable optical circuit implementations (e.g., multi-plane light conversion for spatial mode sorting (Kupianskyi et al., 2022))
A plausible implication is that further hybridization of physical and neural MPLI systems—integrating optical encoding and learned decoding—will unlock real-time, multi-modal, and dynamically controllable imaging with broad impact across sensing, AR/VR, and scientific imaging. The modularity and adaptability of the multi-plane paradigm position it as a central tool in high-fidelity computational photography, efficient scene relighting, and high-dimensional optical systems.