Raymap Representation in Computational Imaging

Updated 2 April 2026

Raymap Representation is a data structure that maps input rays—defined by source position and direction—to outputs like surface intersections and optical properties, serving as a foundational abstraction in computational imaging and rendering.
It employs mathematical functions and neural architectures, such as multilayer perceptrons with careful parameter normalization and loss balancing, to approximate complex ray transformations with high precision and computational efficiency.
Raymaps are pivotal in diverse applications including 3D scene reconstruction, probabilistic volumetric mapping, and optical system modeling, offering scalable, interpretable, and robust methods for addressing complex ray phenomena.

A raymap representation is a general term for any data structure, function, or neural architecture that expresses the mapping from a bundle of input rays (often parameterized by source position and direction) to associated outputs—such as transformed directions, surface intersections, optical characteristics, or probabilistic variables—across various applications in computational imaging, 3D reconstruction, computer vision, optics, and rendering. Raymaps serve as foundational abstractions for both classical deterministic ray tracing and modern, learned or probabilistic methods, often replacing or augmenting explicit geometric scene representations.

1. Mathematical and Algorithmic Forms of Raymaps

Raymap representations formalize ray behavior under different system models. The canonical mathematical form is a function

$f : (\mathbf{p}_{\mathrm{in}}, \mathbf{d}_{\mathrm{in}}) \mapsto (\mathbf{p}_{\mathrm{out}}, \mathbf{d}_{\mathrm{out}})$

where $\mathbf{p}_{\mathrm{in}}$ and $\mathbf{d}_{\mathrm{in}}$ parameterize incident rays (source plane position and direction), and $\mathbf{p}_{\mathrm{out}}, \mathbf{d}_{\mathrm{out}}$ describe the corresponding output ray after propagating through an optical or geometric system (Sinaei et al., 28 Jul 2025).
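As a concrete instance of this signature, the sketch below implements a classical, fully deterministic raymap for a paraxial thin lens followed by free-space propagation. It is an illustrative toy and not taken from any of the cited works; the function names, the 2D slope parameterization of direction, and the 50 mm focal length are assumptions chosen for the example.

```python
from typing import Tuple

Vec2 = Tuple[float, float]


def thin_lens(p_in: Vec2, d_in: Vec2, focal_length: float) -> Tuple[Vec2, Vec2]:
    """Paraxial thin lens: position is unchanged, slope is bent by -p / f."""
    d_out = (d_in[0] - p_in[0] / focal_length, d_in[1] - p_in[1] / focal_length)
    return p_in, d_out


def free_space(p_in: Vec2, d_in: Vec2, distance: float) -> Tuple[Vec2, Vec2]:
    """Paraxial propagation: position advances along the slope, slope is unchanged."""
    p_out = (p_in[0] + distance * d_in[0], p_in[1] + distance * d_in[1])
    return p_out, d_in


def raymap(p_in: Vec2, d_in: Vec2, focal_length: float = 50.0) -> Tuple[Vec2, Vec2]:
    """f : (p_in, d_in) -> (p_out, d_out) for a lens plus one focal length of air."""
    p, d = thin_lens(p_in, d_in, focal_length)
    return free_space(p, d, focal_length)


# A ray parallel to the axis should cross the axis at the focal plane.
p_out, d_out = raymap((2.0, -1.0), (0.0, 0.0))
```

A learned raymap replaces the body of `raymap` with a fitted approximator while keeping the same input and output signature.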

For neural ray fields, such as Ray2Ray, the raymap is an implicit function approximated by a fully-connected multilayer perceptron (MLP) mapping 4D normalized input vectors to 4D outputs (positions and directions), with training objectives defined via physically-grounded loss terms mixing positional ($\mu$m-scale) and angular ($0.01^\circ$-level) discrepancies (Sinaei et al., 28 Jul 2025).
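A loss of this kind can be sketched as a weighted sum of a positional term and an angular term. The version below is a minimal illustration, not the published Ray2Ray objective: it assumes each ray is stored as `(px, py, dx, dy)` with `dx, dy` being direction cosines of a ray travelling toward +z, and it assumes the weight `lam` multiplies the angular term. The default of 100 follows the value the article reports for gradient balancing.

```python
import math


def unit_direction(dx: float, dy: float):
    """Lift two direction cosines to a 3D unit vector (assumes propagation toward +z)."""
    dz = math.sqrt(max(0.0, 1.0 - dx * dx - dy * dy))
    return (dx, dy, dz)


def angle_between(a, b) -> float:
    """Angle in radians between two unit vectors, via atan2 for small-angle accuracy."""
    cross = (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )
    cross_norm = math.sqrt(sum(c * c for c in cross))
    dot = sum(x * y for x, y in zip(a, b))
    return math.atan2(cross_norm, dot)


def ray_loss(pred, target, lam: float = 100.0) -> float:
    """Mean over rays of (position error + lam * angular error)."""
    total = 0.0
    for (px, py, dx, dy), (tx, ty, tdx, tdy) in zip(pred, target):
        pos_err = math.hypot(px - tx, py - ty)
        ang_err = angle_between(unit_direction(dx, dy), unit_direction(tdx, tdy))
        total += pos_err + lam * ang_err
    return total / len(pred)


# One ray whose direction is off by 0.01 rad and whose position is exact.
loss = ray_loss([(0.0, 0.0, math.sin(0.01), 0.0)], [(0.0, 0.0, 0.0, 0.0)])
```

The weight compensates for the two terms living on very different numeric scales; without it, one term tends to dominate the gradient.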

Other domains generalize the concept:

In 3D reconstruction (e.g., Rig3R, RayMap3R), the raymap is a dense field of per-pixel world-space ray origins and directions, $\mathbb{R}^{H\times W\times 6}$, directly encoding per-frame camera geometry and enabling pose, structure, or dynamic-scene reasoning (Li et al., 2 Jun 2025, Wang et al., 21 Mar 2026).
In probabilistic volumetric mapping (MRFMap), the raymap is a set of ray-linked Markov Random Fields in which the measurement probability, forward models, and occlusion constraints are encoded per-ray via structured factors (Shankar et al., 2020).
For freeform optics, the “ray mapping” is an explicit mapping $u(x,y)$ between input and output pupil coordinates, constructed to satisfy both energy conservation and integrability (surface continuity) conditions (Bösel et al., 2015).
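The per-pixel $\mathbb{R}^{H\times W\times 6}$ field used in 3D reconstruction can be sketched for a single pinhole camera as follows. This is a generic construction under stated assumptions, not the exact Rig3R or RayMap3R code: the function name `build_raymap`, the camera-to-world rotation convention, integer pixel centres, and the origin-then-direction channel ordering are all choices made for the example.

```python
import math


def build_raymap(height, width, fx, fy, cx, cy, cam_to_world_rot, cam_center):
    """Return an H x W nested list of 6-vectors (ox, oy, oz, dx, dy, dz) in world space."""
    raymap = []
    for v in range(height):
        row = []
        for u in range(width):
            # Back-project the pixel through the pinhole intrinsics (camera frame, +z forward).
            d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
            # Rotate the direction into the world frame.
            d_world = [
                sum(cam_to_world_rot[i][j] * d_cam[j] for j in range(3))
                for i in range(3)
            ]
            norm = math.sqrt(sum(c * c for c in d_world))
            # Every pixel of a pinhole camera shares the same origin: the camera centre.
            row.append([*cam_center, *(c / norm for c in d_world)])
        raymap.append(row)
    return raymap


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rays = build_raymap(3, 3, fx=2.0, fy=2.0, cx=1.0, cy=1.0,
                    cam_to_world_rot=identity, cam_center=(1.0, 2.0, 3.0))
```

Because the field is a function of intrinsics and pose only, predicting it per frame is equivalent to predicting camera geometry in a dense, image-aligned form.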

2. Input Parameterizations, Network Architectures, and Training Regimes

Raymap implementations depend critically on the choice of input parameterization and the learning or approximation machinery:

Parameter normalizations: Ray2Ray normalizes each source and direction coordinate to $[-1,1]$ relative to the aperture, with no positional encoding (Sinaei et al., 28 Jul 2025). Rig3R and RayMap3R define each pixel's ray by projecting through camera intrinsics and extrinsics, yielding dense fields of direction and camera origin (Li et al., 2 Jun 2025, Wang et al., 21 Mar 2026).
Neural architectures: Ray2Ray demonstrates that an 8-layer, width-256 MLP with periodic skip connections outperforms larger or smaller variants in accuracy-throughput tradeoff. PRIF and MARF, in shape representation, utilize deeper MLP backbones for higher expressivity (Sinaei et al., 28 Jul 2025, Feng et al., 2022, Sundt et al., 2023).
Training loss construction: Optimization objectives balance geometric fidelity (endpoints, angles), physical constraints (direction unit normalization), and, when appropriate, learned noise models or probabilistic targets. For example, Ray2Ray incorporates a weighted sum of position and angle errors, empirically setting $\lambda \approx 100$ for gradient balancing (Sinaei et al., 28 Jul 2025). MRFMap defines factor graph log-likelihoods based on sensor measurements and learned ray-by-ray noise models (Shankar et al., 2020).
Data and supervision: Supervised datasets are constructed through classical raytracing, simulation, or measured data. For instance, Ray2Ray uses a grid of input positions and Monte-Carlo sampled directions traced through commercial optical systems to build its training and evaluation sets (Sinaei et al., 28 Jul 2025).
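The normalization and architecture items above can be illustrated with a small, untrained forward pass. The sketch is written in plain Python for readability and makes several assumptions that the source does not specify: ReLU activations, skip connections implemented by concatenating the raw input every `skip_every` layers, He initialization, "8-layer" read as eight hidden layers plus a linear head, and the helper names `normalize_coord`, `init_mlp`, and `forward`.

```python
import random


def normalize_coord(value: float, half_extent: float) -> float:
    """Map a coordinate in [-half_extent, half_extent] to [-1, 1]."""
    return value / half_extent


def init_mlp(in_dim=4, out_dim=4, width=256, depth=8, skip_every=4, seed=0):
    """Random weights for a fully-connected net with periodic input re-injection."""
    rng = random.Random(seed)
    layers, fan_in = [], in_dim
    for i in range(depth):
        if i > 0 and i % skip_every == 0:
            fan_in += in_dim  # this layer also sees the raw input
        scale = (2.0 / fan_in) ** 0.5  # He initialization (assumes ReLU)
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)] for _ in range(width)]
        layers.append((weights, [0.0] * width))
        fan_in = width
    head_w = [[rng.gauss(0.0, (1.0 / width) ** 0.5) for _ in range(width)]
              for _ in range(out_dim)]
    return layers, (head_w, [0.0] * out_dim), skip_every


def dense(weights, bias, x):
    """Affine layer: one dot product per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def forward(params, ray):
    """Map a normalized 4D ray (px, py, dx, dy) to a 4D output ray."""
    layers, (head_w, head_b), skip_every = params
    h = list(ray)
    for i, (w, b) in enumerate(layers):
        if i > 0 and i % skip_every == 0:
            h = h + list(ray)  # periodic skip connection by concatenation
        h = [max(0.0, z) for z in dense(w, b, h)]
    return dense(head_w, head_b, h)


# Small width keeps the pure-Python demo fast; the article reports width 256.
params = init_mlp(width=32)
ray = [normalize_coord(2.5, 5.0), normalize_coord(-1.0, 5.0), 0.1, -0.3]
out_ray = forward(params, ray)
```

In practice such a network would be built and trained in an array framework; the point here is only the data flow from normalized 4D input to 4D output.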

3. Representation Variants and Physical Interpretability

Raymap representations span a spectrum from physical, geometric, and analytic models to fully learned or statistical abstractions:

Deterministic ray transform models: Ray2Ray substitutes sequential geometric optics surface tracing with a global neural mapping, yielding direct physical interpretability per ray pair while achieving a throughput speedup on the order of $1000\times$ (Sinaei et al., 28 Jul 2025). Classical freeform ray mapping applies optimal transport to compute physically integrable $u(x,y)$ mappings, followed by PDE-based surface recovery (Bösel et al., 2015).
Probabilistic and statistical fields: MRFMap employs a global factor graph in which each ray/voxel factor encodes measurement likelihood and occlusion, capturing uncertainties and dependencies lost in independent occupancy-grid updates (Shankar et al., 2020). Analysis over rays, not voxels, enables more accurate mapping with explicit reasoning about occlusion and noise.
Learned scene structure: In Rig3R, the raymap is foundational for decomposing frames into pose raymaps and rig-relative raymaps, the latter supporting unsupervised rig structure discovery when synchronization metadata is unavailable (Li et al., 2 Jun 2025).

Physically enforceable constraints include:

Output direction normalization,
Implicit or explicit adherence to Snell's law or energy conservation (enforced empirically through supervision),
Reconstruction of amplitude and phase (wavefront curvature and divergence) in advanced wave-optical frameworks (Ren et al., 2024).
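The first two constraints can be checked after the fact on any predicted ray pair. The sketch below renormalizes an output direction and measures how far a pair deviates from Snell's law at a single planar interface. It is a diagnostic illustration rather than part of any cited method; the helper names, the refractive indices 1.0 and 1.5, and the convention that the surface normal points against the incident ray are assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    """Rescale a vector to unit length (output direction normalization)."""
    norm = math.sqrt(dot(v, v))
    return tuple(c / norm for c in v)


def refract(d, n, n1, n2):
    """Vector form of Snell's law for unit d and unit normal n opposing d.

    Returns None on total internal reflection.
    """
    eta = n1 / n2
    cos_i = -dot(d, n)
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None
    cos_t = math.sqrt(1.0 - sin2_t)
    return tuple(eta * a + (eta * cos_i - cos_t) * b for a, b in zip(d, n))


def snell_residual(d_in, d_out, n, n1, n2):
    """n1 sin(theta_i) - n2 sin(theta_t); zero for a physically consistent pair."""
    sin_i = math.sqrt(max(0.0, 1.0 - dot(d_in, n) ** 2))
    sin_t = math.sqrt(max(0.0, 1.0 - dot(d_out, n) ** 2))
    return n1 * sin_i - n2 * sin_t


theta = math.radians(30.0)
d_in = (math.sin(theta), 0.0, math.cos(theta))
normal = (0.0, 0.0, -1.0)
d_true = refract(d_in, normal, 1.0, 1.5)
# Stand-in for a slightly wrong network output, renormalized to unit length.
d_pred = unit((d_true[0] + 0.01, d_true[1], d_true[2]))
residual_true = snell_residual(d_in, d_true, normal, 1.0, 1.5)
residual_pred = snell_residual(d_in, d_pred, normal, 1.0, 1.5)
```

A residual of this kind could also serve as an auxiliary penalty during training, although the models surveyed here are described as relying on supervision alone.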

4. Practical and Computational Implications

Raymap-based pipelines offer substantial efficiency and scalability gains over surface-by-surface or sample-by-sample approaches:

Acceleration: Ray2Ray achieves $\mu$m-scale mean positional error and $0.01^\circ$-level mean angular deviation in commercial optical assemblies, while running up to roughly $1000\times$ faster than conventional fine-grained tracing (Sinaei et al., 28 Jul 2025).
Representation compression: Raymaps in X-Ray, PRIF, and MARF condense dense 3D or surface geometry into concise, ray-parameterized forms, supporting generative pipelines, mesh extraction, and differentiable rendering with orders-of-magnitude fewer queries (Hu et al., 2024, Feng et al., 2022, Sundt et al., 2023).
Algorithmic robustness: RayMap3R integrates raymap tokens into a streaming transformer state, supporting temporally consistent 3D mapping and robust distinction between static and dynamic scene content, even without per-frame training, through static-scene biased inference and dual-branch gating (Wang et al., 21 Mar 2026).
Expressive applications: In neural radiance/reflectance textures, bucket-structured “raymaps” allow real-time view-dependent rendering with physically accurate optical effects pre-baked into per-texel directional samples, removing the need for runtime computation of complex phenomena (Fober, 2023).

5. Applications Across Optical, Computational, and Graphics Domains

Raymap representations underpin a wide range of contemporary research directions:

Optical system modeling: Proxy optical raytracing with neural raymaps for lens assemblies and other systems, mapping input ray grids to physical outputs directly (Sinaei et al., 28 Jul 2025).
3D scene reconstruction and vision: Streaming or batch 3D reconstruction utilizing per-view raymaps for structure-from-motion, egomotion, and dynamic scene parsing tasks (Li et al., 2 Jun 2025, Wang et al., 21 Mar 2026).
Probabilistic mapping and sensor data fusion: Ray-based volumetric occupancy estimation with explicit noise modeling and occlusion reasoning (MRFMap) (Shankar et al., 2020).
Generative 3D geometry and shape analysis: PRIF and MARF frameworks directly regress surface hitpoints or medial atom decomposition from rays, supporting single- and multi-object encoding, generative modeling, pose estimation, and robust mesh extraction (Feng et al., 2022, Sundt et al., 2023).
Optical design and freeform surface computation: Optimal transport-based ray mapping solves the inverse problem of shaping optical surfaces for prescribed intensity distributions, ensuring physical integrability and manufacturability (Bösel et al., 2015).
Neural rendering and reflectance: Real-time, physically plausible reflection, refraction, and subsurface scattering using radiance-bucket raymaps as GPU rasterization textures (Fober, 2023).
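The energy-conservation condition behind freeform ray mapping is easiest to see in one dimension, where the optimal transport map is the monotone rearrangement satisfying $\int_{x_0}^{x} I(s)\,ds = \int_{u_0}^{u(x)} E(t)\,dt$ for normalized source irradiance $I$ and target irradiance $E$. The sketch below computes that 1D map numerically. It is a simplified illustration only: the 2D problem treated by Bösel et al. additionally requires an integrability condition, and the function names and densities used here are assumptions.

```python
from bisect import bisect_left


def cumulative(density, grid):
    """Trapezoidal cumulative energy of `density` on `grid`, normalized to end at 1."""
    cdf = [0.0]
    for a, b in zip(grid, grid[1:]):
        cdf.append(cdf[-1] + 0.5 * (density(a) + density(b)) * (b - a))
    total = cdf[-1]
    return [c / total for c in cdf]


def ray_mapping_1d(source, target, x_grid, u_grid):
    """Monotone map u(x) with equal enclosed energy: F_source(x) = F_target(u(x))."""
    f_src = cumulative(source, x_grid)
    f_tgt = cumulative(target, u_grid)
    mapping = []
    for level in f_src:
        # Invert the target cumulative energy by linear interpolation.
        k = min(max(bisect_left(f_tgt, level), 1), len(u_grid) - 1)
        lo, hi = f_tgt[k - 1], f_tgt[k]
        t = 0.0 if hi == lo else (level - lo) / (hi - lo)
        mapping.append(u_grid[k - 1] + t * (u_grid[k] - u_grid[k - 1]))
    return mapping


grid = [i / 100 for i in range(101)]
# Source irradiance rising linearly across the aperture, uniform target irradiance.
u_of_x = ray_mapping_1d(lambda x: 1.0 + x, lambda u: 1.0, grid, grid)
```

For this choice of densities the map has the closed form $u(x) = (x + x^2/2)/1.5$, which gives a direct check on the numerical result.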

6. Limitations, Open Questions, and Future Directions

Despite widespread adoption, limitations and research challenges persist:

Generalization: Neural raymaps trained on specific devices or scene classes (e.g., Ray2Ray) exhibit performance drops when exposed to novel optical configurations or geometric/topological domain shifts, with accuracy losses up to 2–3$\times$ unless fine-tuned (Sinaei et al., 28 Jul 2025).
Physical constraints: Learned raymap models generally do not encode explicit Snell-law or energy conservation layers, relying on supervision to enforce physicality. In edge or high-angle regimes, errors may increase due to training sample scarcity.
Expressivity and capacity: The theoretical limits of what compact neural raymap approximators can encode remain understudied, particularly for challenging high-frequency geometric surfaces and for anisotropic or heterogeneous media (Feng et al., 2022).
Dynamic reasoning: Separating static from dynamic content in real-time streaming 3D reconstruction requires additional inference machinery, such as RayMap3R’s dual-branch discrepancy gating and temporally smoothed state tokens (Wang et al., 21 Mar 2026).

A plausible implication is that future work will need to combine data-driven raymap architectures with stronger physical priors, hierarchical adaptive sampling, or learned uncertainty quantification to further improve fidelity and robustness in real-world settings, including robotics, photonics, and omnidirectional 3D capture.

7. Summary Table: Representative Raymap Approaches

Raymap Variant	Mathematical Form/Structure	Key Reference(s)
Neural optical system mapping	$\mathbb{R}^4 \to \mathbb{R}^4$ MLP on (pos, dir)	(Sinaei et al., 28 Jul 2025)
Rig/pose raymaps for vision	Per-pixel field: $\mathbb{R}^{H\times W\times 6}$	(Li et al., 2 Jun 2025, Wang et al., 21 Mar 2026)
Probabilistic occupancy (MRFMap)	MRF over ray/voxel factors	(Shankar et al., 2020)
Freeform surface ray mapping	OMT-based $u(x,y)$ + advection	(Bösel et al., 2015)
Medial atom (MARF)	Ray $\to$ set of spheres	(Sundt et al., 2023)
Primary Ray-based Implicit (PRIF)	Ray $\to$ surface hit (MLP)	(Feng et al., 2022)
Sequential X-Ray representation	Ray $\to$ [hit/normal/color] frames	(Hu et al., 2024)

These approaches exemplify the diversity and centrality of raymap representations across physical simulation, 3D computational vision, neural rendering, and optical system design.