GPU-Accelerated Monte Carlo Mie Scattering

Updated 3 August 2025

The paper presents a GPU-accelerated Monte Carlo model that incorporates rigorous Mie theory to accurately simulate radiative scattering, outperforming traditional Henyey–Greenstein methods.
It leverages CUDA-enabled parallelism with precomputed CDFs for scattering angle sampling, achieving orders-of-magnitude reduction in computational cost.
Experimental validation shows improved RMSE and correlation coefficients, confirming enhanced resolution of angular features in applications like biomedical imaging and atmospheric sensing.

A GPU-Accelerated Monte Carlo-Rigorous Mie Scattering Transport Model enables high-fidelity, computationally efficient simulation of radiative transfer in complex scattering environments by combining CUDA-enabled parallelism with explicit Mie theory. This approach directly addresses limitations of the Henyey–Greenstein approximation in modeling phase functions and is validated against controlled experiments with monodisperse microsphere suspensions. It reproduces both absolute intensity and fine angular features more accurately, while reducing computational cost by orders of magnitude compared to CPU-based models (Wang et al., 31 Jul 2025).

1. GPU-Accelerated Monte Carlo Methodology

The core strategy utilizes a classic photon packet Monte Carlo transport model, in which each packet undergoes independent simulated propagation, scattering, and detection steps. GPU acceleration is achieved through:

CUDA Thread Mapping: Each thread simulates an independent photon, exploiting “embarrassingly parallel” structure.
Photon State Handling: Arrays representing photon position, direction, weight, and accumulated detector contributions are allocated in global GPU memory. Access patterns are optimized for coalesced reads and writes. Accumulation of light intensity relies on GPU atomic operations for thread safety.
Pre-computed CDF for Scattering Angle Sampling: For rigorous Mie phase functions, the scattering angle is not sampled by integrating and inverting the phase function at runtime. Instead, each sample is a binary search in a precomputed lookup table of cumulative distribution values (at ~3000 angular points). This keeps the numerical inversion cheap, even for highly oscillatory Mie functions.
Simulation Throughput: Reported throughput is 4.56 million photons/s with GPU acceleration, enabling simulations otherwise prohibitive for full Mie physics at high optical depth.
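The table-based sampling described above can be sketched on the CPU in plain Python. This is a minimal illustration, not the authors' CUDA kernel: the function names `build_cdf_table`, `sample_theta`, and `hg_phase` are hypothetical, the uniform grid in $\theta$ and the linear interpolation between table entries are assumptions, and a Henyey–Greenstein density is used only as a stand-in for a tabulated Mie phase function.

```python
import bisect
import math

def hg_phase(theta, g=0.9):
    """Stand-in phase function (Henyey-Greenstein); a Mie table would go here."""
    return (1.0 - g * g) / (1.0 + g * g - 2.0 * g * math.cos(theta)) ** 1.5

def build_cdf_table(phase_fn, n_points=3000):
    """Tabulate the scattering-angle CDF on a uniform theta grid (assumed layout)."""
    thetas = [math.pi * i / (n_points - 1) for i in range(n_points)]
    # The probability density in theta is proportional to p(theta) * sin(theta).
    dens = [phase_fn(t) * math.sin(t) for t in thetas]
    cdf = [0.0]
    for i in range(1, n_points):
        # Trapezoidal accumulation of the density.
        step = thetas[i] - thetas[i - 1]
        cdf.append(cdf[-1] + 0.5 * (dens[i] + dens[i - 1]) * step)
    total = cdf[-1]
    cdf = [c / total for c in cdf]
    cdf[-1] = 1.0  # guard against rounding in the last entry
    return thetas, cdf

def sample_theta(thetas, cdf, xi):
    """Map a uniform number xi in [0, 1) to a scattering angle."""
    # Binary search for the first table entry strictly greater than xi.
    j = bisect.bisect_right(cdf, xi)
    j = min(max(j, 1), len(cdf) - 1)
    lo, hi = cdf[j - 1], cdf[j]
    frac = 0.0 if hi == lo else (xi - lo) / (hi - lo)
    # Linear interpolation inside the bracketing interval (an assumption).
    return thetas[j - 1] + frac * (thetas[j] - thetas[j - 1])
```

The table is built once per phase function, so each scattering event costs one uniform random number and roughly $\log_2 3000 \approx 12$ comparisons, regardless of how oscillatory the phase function is.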

This architecture delivers speedups of several hundred times over CPU implementations, directly supporting fine-grained phase function features and multi-order multiple scattering regimes (Wang et al., 31 Jul 2025).
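The per-packet work that each GPU thread repeats (propagate, attenuate, scatter) can be illustrated with a short CPU-side sketch. It uses the standard Monte Carlo photon transport formulas for exponential free-path sampling and direction rotation; the source does not give the authors' exact update rules, so the function names, the `albedo` weight update, and the near-vertical threshold are assumptions.

```python
import math

def sample_free_path(mu_t, xi):
    """Distance to the next interaction; xi is uniform in (0, 1]."""
    return -math.log(xi) / mu_t

def rotate_direction(u, cos_theta, phi):
    """Deflect unit vector u by polar angle theta and azimuth phi."""
    ux, uy, uz = u
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    cos_phi, sin_phi = math.cos(phi), math.sin(phi)
    if abs(uz) > 0.99999:
        # Near-vertical direction: use the limiting form to avoid dividing by ~0.
        return (sin_theta * cos_phi, sin_theta * sin_phi,
                math.copysign(cos_theta, uz))
    t = math.sqrt(1.0 - uz * uz)
    return (
        sin_theta * (ux * uz * cos_phi - uy * sin_phi) / t + ux * cos_theta,
        sin_theta * (uy * uz * cos_phi + ux * sin_phi) / t + uy * cos_theta,
        -sin_theta * cos_phi * t + uz * cos_theta,
    )

def packet_step(pos, u, weight, mu_t, albedo, cos_theta, phi, xi):
    """Advance one packet: move, attenuate its weight, then scatter."""
    s = sample_free_path(mu_t, xi)
    new_pos = tuple(p + s * d for p, d in zip(pos, u))
    new_weight = weight * albedo  # fraction of the packet surviving absorption
    new_u = rotate_direction(u, cos_theta, phi)
    return new_pos, new_u, new_weight
```

Because `packet_step` reads and writes only its own packet state, packets never depend on one another, which is what makes the one-thread-per-photon mapping possible. Only the final accumulation into shared detector bins needs synchronization.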

2. Rigorous Mie Theory Implementation

The model departs from the traditional Henyey–Greenstein (H–G) approximation by explicitly implementing Mie theory:

Mie Amplitude Functions: The scattering amplitudes $S_1(\theta)$ and $S_2(\theta)$ are evaluated as series over Riccati–Bessel functions and Mie coefficients $a_n$, $b_n$ (dependent on complex refractive index $m$ and size parameter $x$), following canonical Mie formulas (e.g., equations (5)-(8) (Wang et al., 31 Jul 2025)).
Angular Phase Function: The differential phase function is evaluated as $P(\cos{\theta}) = \frac{1}{k^2}\left(|S_1(\theta)|^2 + |S_2(\theta)|^2\right)$, with $k = (2\pi/\lambda)m$. This captures the sharp forward peaks, oscillatory structure, and backscattering lobes intrinsic to real microscale scatterers, which the H–G form, controlled by the single asymmetry parameter $g$, cannot represent.
Polarization Effects: The model enables (as in related works) the calculation of the full Mueller matrix, supporting both intensity and the Stokes vector. For polarized light, field transformations use Jones and subsequently Mueller matrix calculus, yielding spatially resolved Stokes maps and the degree of polarization (DOP) (Heller et al., 31 Dec 2024).
Normalization: The phase function is numerically integrated over $4\pi$ solid angle and renormalized for conservation of scattered energy.
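For reference, the textbook form of the amplitude series named above is given below. The paper's equations (5)-(8) are not reproduced in this article, so this is the standard formulation (as in Bohren and Huffman), not a transcription. The angular functions $\pi_n$, $\tau_n$ and the normalized phase function $\hat{P}$ are symbols introduced here for illustration.

$$S_1(\theta) = \sum_{n=1}^{\infty} \frac{2n+1}{n(n+1)} \left[ a_n \pi_n(\cos\theta) + b_n \tau_n(\cos\theta) \right], \qquad S_2(\theta) = \sum_{n=1}^{\infty} \frac{2n+1}{n(n+1)} \left[ a_n \tau_n(\cos\theta) + b_n \pi_n(\cos\theta) \right]$$

$$\pi_n(\cos\theta) = \frac{P_n^1(\cos\theta)}{\sin\theta}, \qquad \tau_n(\cos\theta) = \frac{d P_n^1(\cos\theta)}{d\theta}$$

One common normalization convention, assumed here because the source does not state which one is used, scales the phase function so that it integrates to one over the sphere:

$$\hat{P}(\cos\theta) = \frac{P(\cos\theta)}{2\pi \int_{-1}^{1} P(\mu)\, d\mu}, \qquad \int_{4\pi} \hat{P}\, d\Omega = 1$$

Other codes normalize the same integral to $4\pi$. The choice does not matter for sampling, since the CDF is rescaled to end at 1 either way.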

Significance: This explicit Mie implementation is essential for accurate simulations in media with high size parameter contrast, multiple interfering scattering orders, or polarization-sensitive transport, regimes where H–G is demonstrably insufficient.
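To make the Mie evaluation concrete, here is a minimal pure-Python sketch of the amplitude series using the widely used Bohren–Huffman style recurrences. It is not the authors' implementation. The function name `mie_amplitudes` is hypothetical, `m` is taken as the sphere-to-medium refractive index ratio, and the refractive indices in the usage note (1.59 for polystyrene, 1.33 for water) and the reading of "5 μm" as a diameter are illustrative assumptions.

```python
import math

def mie_amplitudes(m, x, mus):
    """Return (S1, S2, Q_ext, Q_sca) for relative index m and size parameter x.

    S1 and S2 are lists evaluated at each mu = cos(theta) in `mus`.
    """
    nstop = int(x + 4.0 * x ** (1.0 / 3.0) + 2.0)  # series truncation rule
    mx = m * x
    nmx = int(max(nstop, abs(mx))) + 15
    # Logarithmic derivative D_n(mx), computed by downward recurrence.
    d = [0j] * (nmx + 1)
    for n in range(nmx, 0, -1):
        d[n - 1] = n / mx - 1.0 / (d[n] + n / mx)
    # Riccati-Bessel functions psi_n, chi_n by upward recurrence.
    psi0, psi1 = math.cos(x), math.sin(x)
    chi0, chi1 = -math.sin(x), math.cos(x)
    xi1 = complex(psi1, -chi1)
    s1 = [0j] * len(mus)
    s2 = [0j] * len(mus)
    pi0 = [0.0] * len(mus)  # pi_{n-1}
    pi1 = [1.0] * len(mus)  # pi_n, starting at pi_1 = 1
    qext = 0.0
    qsca = 0.0
    for n in range(1, nstop + 1):
        psi = (2 * n - 1) / x * psi1 - psi0
        chi = (2 * n - 1) / x * chi1 - chi0
        xi = complex(psi, -chi)
        da = d[n] / m + n / x
        db = m * d[n] + n / x
        an = (da * psi - psi1) / (da * xi - xi1)
        bn = (db * psi - psi1) / (db * xi - xi1)
        qext += (2 * n + 1) * (an + bn).real
        qsca += (2 * n + 1) * (abs(an) ** 2 + abs(bn) ** 2)
        fn = (2 * n + 1) / (n * (n + 1))
        for j, mu in enumerate(mus):
            pi_n = pi1[j]
            tau_n = n * mu * pi_n - (n + 1) * pi0[j]
            s1[j] += fn * (an * pi_n + bn * tau_n)
            s2[j] += fn * (an * tau_n + bn * pi_n)
            # Advance the angular recurrence to pi_{n+1}.
            pi1[j] = ((2 * n + 1) * mu * pi_n - (n + 1) * pi0[j]) / n
            pi0[j] = pi_n
        psi0, psi1 = psi1, psi
        chi0, chi1 = chi1, chi
        xi1 = complex(psi1, -chi1)
    return s1, s2, 2.0 * qext / x ** 2, 2.0 * qsca / x ** 2
```

Under the stated assumptions, the validation sample would correspond to `m = 1.59 / 1.33` and `x = math.pi * 5.0 * 1.33 / 0.532` (about 39). The unnormalized phase function at each angle is then `abs(s1[j])**2 + abs(s2[j])**2`. For a non-absorbing sphere (real `m`), the returned extinction and scattering efficiencies should agree, which is a convenient self-check.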

3. Quantitative Performance and Model Validation

Experimental validation was conducted using a physical scattering platform:

Sample System: Monodisperse 5 μm polystyrene microspheres suspended in deionized water within a standard quartz cuvette.
Optical Setup: Collimated 532 nm laser, 90° lateral detection, spatial registration techniques for precise alignment between simulation grid and experimental imagery. Optical depth (OD) varied from 5 to 12.5 via Beer–Lambert law-controlled dilution.
Quantitative Metrics: Simulated and measured lateral scattering distributions were compared via RMSE and correlation coefficient ($r$). On average, the rigorous Mie model achieved a 3.62% lower RMSE and a 6.33% higher $r$ than H–G. The radial error was systematically lower, particularly in the intermediate and edge regions of the detector.
Angular Resolution: Oscillatory intensity features and lateral backscattering are reproduced accurately only by the Mie-based simulation, not by H–G. The difference is most pronounced at high OD, where multiple scattering dominates.
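The dilution control mentioned in the optical setup follows from the Beer–Lambert relation for the unscattered (ballistic) beam. The symbols $N$ (particle number density), $\sigma_{\mathrm{ext}}$ (single-particle extinction cross-section), and $L$ (path length through the cuvette) are introduced here for illustration and are not taken from the source.

$$\frac{I}{I_0} = e^{-\mathrm{OD}}, \qquad \mathrm{OD} = N\,\sigma_{\mathrm{ext}}\,L$$

With $\sigma_{\mathrm{ext}}$ and $L$ fixed, OD is proportional to concentration, so moving from the highest to the lowest reported optical depth corresponds to

$$\frac{N_{\mathrm{OD}=5}}{N_{\mathrm{OD}=12.5}} = \frac{5}{12.5} = 0.4,$$

that is, diluting the suspension to 40% of its original number density.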

Table: Comparison of Phase Functions and Simulation Accuracy

Model	RMSE (relative to H–G)	Correlation coefficient r (relative to H–G)	Fine angular features
Henyey–Greenstein (H–G)	Baseline	Baseline	Misses oscillatory details
Rigorous Mie (GPU-MC)	3.62% lower (average)	6.33% higher (average)	Captures oscillations and peaks
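The two metrics in the table can be computed as follows. This is a minimal sketch using the usual definitions of RMSE and the Pearson correlation coefficient. The source does not state how the profiles were normalized or binned before comparison, so the function names and the flat-list inputs are assumptions.

```python
import math

def rmse(simulated, measured):
    """Root-mean-square error between two equally long intensity profiles."""
    n = len(simulated)
    return math.sqrt(sum((s - m) ** 2 for s, m in zip(simulated, measured)) / n)

def pearson_r(simulated, measured):
    """Pearson correlation coefficient between two intensity profiles."""
    n = len(simulated)
    mean_s = sum(simulated) / n
    mean_m = sum(measured) / n
    cov = sum((s - mean_s) * (m - mean_m) for s, m in zip(simulated, measured))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_s * var_m)
```

RMSE is sensitive to absolute intensity offsets, while $r$ measures only whether the shapes of the two profiles track each other, which is why the two are reported together.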

4. Algorithmic Comparisons and Optimizations

Key differences and optimizations in the rigorous GPU-Mie approach versus approximated or CPU-based models include:

Phase Function Sampling: The H–G phase function permits analytical inversion for angle sampling. Mie-based phase functions require numerical CDF inversion, efficiently solved here via lookup and binary search.
Multiple Scattering and High Optical Depth: Standard MC codes with H–G underperform in angular intensity features and backscattering at high OD; the presented model maintains fidelity in these regions, matching experimental data.
Memory Access Patterns: For scalability, global memory arrays are organized for coalesced access. Thread divergence is minimized by harmonizing photon lifetimes and path sampling logic.
Spatial Registration: GPU-accelerated transformations ensure that simulation outputs align with experimental detection geometry, permitting pixelwise quantitative comparisons.
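For contrast with the table-based Mie sampling, the analytical inversion available for H–G is short enough to show in full. The formula is the standard closed-form H–G inverse; the function name and the small-$g$ cutoff for the isotropic limit are assumptions made for this sketch.

```python
def sample_hg_cos_theta(g, xi):
    """Closed-form inverse of the H-G CDF; xi is uniform in [0, 1]."""
    if abs(g) < 1e-6:
        # Isotropic limit: cos(theta) is uniform on [-1, 1].
        return 2.0 * xi - 1.0
    term = (1.0 - g * g) / (1.0 - g + 2.0 * g * xi)
    return (1.0 + g * g - term * term) / (2.0 * g)
```

The mean of the sampled $\cos\theta$ equals $g$, which is the only angular information the H–G form carries. A Mie phase function has no such closed-form inverse, hence the lookup table.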

5. Applications and Implications

The GPU-accelerated MC-Mie model has been demonstrated, or is directly applicable, in:

Biomedical Imaging: Enhanced modeling of tissue scattering for image reconstruction, parameter inversion, and diagnostic contrast (especially when subwavelength or mesoscale structures dominate angular scattering) (Heller et al., 31 Dec 2024).
Environmental and Atmospheric Sensing: Improved prediction of aerosol and particulate scattering signatures in remote sensing, including nontrivial angular, polarization, or high-OD effects.
Industrial Process and Ocean Optics: Accurate light transport modeling in turbid or multiphase fluids where particle size and composition distributions are variable.
General Optical Measurement Correction: Provides a reliable baseline for removing systematic errors from instrument or analytic models that assume H–G-like simplifications.

This suggests that broad classes of high-precision optical systems with complex scattering regimes stand to benefit from the explicit, parallelized Mie formalism.

6. Limitations, Perspectives, and Future Directions

Memory and Preprocessing: The explicit Mie approach introduces significant storage requirements for high-resolution CDFs and phase function tables, particularly when polarization and multiple size distributions are modeled.
Scaling: While the current implementation processes 4.56 million photons/s using fine-grained, thread-level parallelism, future hardware with larger global memory and multi-GPU systems could push model fidelity further.
Model Scope: The approach currently excels for spherical and monodisperse scatterers as described by Mie theory. Non-spherical scattering and coupled particle effects are beyond the scope of the present model.
Advanced Acceleration: Techniques such as hybrid variance reduction, intelligent pathfinding (as suggested for radiative transfer simulations (Krieger et al., 2020)), or adaptive mesh refinement may further improve efficiency or broaden model applicability.
Experimental Validation: Close agreement has so far been demonstrated only in controlled microsphere suspensions. Realistic application domains still require system-specific validation.

7. Summary

The GPU-Accelerated Monte Carlo-Rigorous Mie Scattering Transport Model represents a significant advance in radiative transfer simulation for complex scattering environments. By explicitly implementing full Mie theory in a massively parallel architecture and validating against quantitative experiments, the model overcomes the limitations of traditional single-parameter approximations, provides detailed angular and spatial accuracy, and achieves computational performance compatible with practical high-fidelity optical modeling requirements (Wang et al., 31 Jul 2025). The rigorous phase function implementation, combined with efficient GPU-based integration and experimental benchmarking, positions this framework as a reference for future research and application in scattering-dominated optical systems.


References (3)

GPU-Accelerated Monte Carlo Simulation and Experimental Study of Radiative Transfer in Multiple Scattering Media (2025)

Mie scattering due to tissue structures in the terahertz regime: Experimental and Monte Carlo verification using diffused polarimetric imaging in highly attenuating tissue phantoms (2024)

The scattering order problem in Monte Carlo radiative transfer (2020)