
FieldSeer I: Geometry-Aware Field Forecasting

Updated 12 December 2025
  • FieldSeer I is a geometry-aware world model that forecasts long-horizon electromagnetic field evolution in 2-D TE waveguides using a symlog transformation for numerical stability.
  • It integrates geometry-conditioned tokenization, masked Transformer planning, and a GRU dynamics core to assimilate partial observations and generate high-fidelity predictions.
  • The model supports immediate, editable rollouts of photonic digital twins, enabling interactive design adjustments without re-assimilation or retraining.

FieldSeer I is a geometry-aware world model designed for long-horizon forecasting of electromagnetic field dynamics in two-dimensional transverse-electric (TE) waveguides under partial observability. The model assimilates a short prefix of observed electromagnetic fields, conditions on a user-specified scalar source action and a structure/material map, and generates closed-loop rollouts of future field evolution in the physical domain. Training in a symmetric-logarithmic (symlog) domain ensures numerical stability against multi-scale field amplitudes. FieldSeer I achieves substantially higher suffix fidelity than GRU and deterministic Transformer-based baselines in a fully reproducible finite-difference time-domain (FDTD) benchmarking protocol. Notably, FieldSeer I enables immediate geometry-editable rollouts, facilitating interactive photonic digital twin applications without re-assimilation or retraining (Guo et al., 5 Dec 2025).

1. Physical Background and Problem Formalization

FieldSeer I targets predictive modeling for 2-D TE waveguides, where the nonzero electromagnetic field components are $E_z$, $H_x$, and $H_y$. These are discretized on a uniform Yee grid, where the semi-discrete Maxwell–Faraday and Maxwell–Ampère equations in vacuum cells are:

$$\frac{\partial E_z}{\partial t} = (\nabla \times H)_z, \qquad \frac{\partial H_x}{\partial t} = -(\nabla \times E)_x, \qquad \frac{\partial H_y}{\partial t} = -(\nabla \times E)_y$$

Dispersive inclusions are modeled using a Lorentz dielectric response,

$$\varepsilon(\omega) = \varepsilon_{\infty} + \frac{\omega_p^2}{\omega_0^2 - \omega^2 - i\gamma\omega}$$
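The Lorentz response above is straightforward to evaluate numerically. The sketch below implements the formula directly; the default parameter values are illustrative choices, not values from the paper.

```python
import numpy as np

def lorentz_eps(omega, eps_inf=1.0, omega_p=1.0, omega_0=2.0, gamma=0.1):
    """Lorentz dielectric response eps(w) = eps_inf + wp^2 / (w0^2 - w^2 - i*gamma*w).

    Parameters are angular frequencies in consistent (e.g. normalized) units;
    the defaults here are illustrative, not taken from the paper.
    """
    return eps_inf + omega_p**2 / (omega_0**2 - omega**2 - 1j * gamma * omega)
```

At zero frequency this reduces to the static permittivity $\varepsilon_\infty + \omega_p^2/\omega_0^2$, and the imaginary part near resonance encodes absorption.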

Polarization currents are updated using consistent FDTD time-stepping. A band-limited Gaussian pulse of amplitude $a_t$ is injected at each timestep via a total-field/scattered-field (TFSF) boundary. Perfectly matched layers (PML) and Mur updates provide non-reflecting boundaries to avoid wrap-around artifacts. Upstream and downstream 1-D probes record $E_z$ and $H_y$; these measurements enable computation of the Poynting flux $S_x = E_z H_y$.
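The ground-truth dynamics come from a standard leapfrog Yee update. As a minimal sketch of the vacuum-cell update (normalized units, unit cell size, no PML/TFSF/dispersion; all of these simplifications are assumptions for illustration):

```python
import numpy as np

def step_te(Ez, Hx, Hy, dt=0.5):
    """One leapfrog FDTD step for the 2-D TE mode on a Yee grid.

    Minimal sketch in normalized units (eps0 = mu0 = 1, unit cell size),
    vacuum cells only; boundaries, TFSF injection, and Lorentz media from
    the text are omitted. Arrays are updated in place and returned.
    """
    # Maxwell-Faraday: dHx/dt = -dEz/dy, dHy/dt = +dEz/dx
    Hx[:, :-1] -= dt * (Ez[:, 1:] - Ez[:, :-1])
    Hy[:-1, :] += dt * (Ez[1:, :] - Ez[:-1, :])
    # Maxwell-Ampere: dEz/dt = dHy/dx - dHx/dy
    Ez[1:, 1:] += dt * ((Hy[1:, 1:] - Hy[:-1, 1:]) - (Hx[1:, 1:] - Hx[1:, :-1]))
    return Ez, Hx, Hy

def poynting_flux(Ez_probe, Hy_probe):
    """Poynting flux S_x = Ez * Hy along a 1-D probe line, as in the text."""
    return Ez_probe * Hy_probe
```

A point excitation in `Ez` spreads into the magnetic components after a single step, which is the behavior the probes downstream measure.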

Partial observability is formalized as access to:

  • Prefix of $P$ observed frames: $E_1, \ldots, E_P$
  • Suffix of $Q$ ground-truth future frames: $E_{P+1}, \ldots, E_{P+Q}$
  • Source action stream: $\{a_t\}_{t=1}^{P+Q}$
  • Structure/material map: $\mathrm{STR} \in \{0,1\}^{W \times H \times C}$

At inference, only the prefix, full structure map, and action stream are provided. The model must first assimilate the prefix and then roll out $Q$ frames in closed loop, optionally under a modified geometry.
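Concretely, one training sample can be pictured as the following container; the shapes follow the formalization above, while the channel count `C = 3` and the zero-filled contents are placeholders for illustration.

```python
import numpy as np

# Hypothetical shapes for one sample: P prefix frames, Q suffix frames,
# a W x H grid, and C binary structure channels (C = 3 is an assumption).
P, Q, W, H, C = 80, 80, 64, 64, 3

sample = {
    "prefix":    np.zeros((P, W, H)),                   # observed E_1..E_P
    "suffix":    np.zeros((Q, W, H)),                   # ground truth E_{P+1}..E_{P+Q}
    "actions":   np.zeros(P + Q),                       # source amplitudes a_t
    "structure": np.zeros((W, H, C), dtype=np.uint8),   # binary STR map
}
```

At inference only `prefix`, `structure`, and `actions` are visible; `suffix` exists solely for evaluation.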

2. FieldSeer I Architecture and Training

FieldSeer I follows a physics-guided world-model paradigm, consisting of the following components:

  • Symlog Field Representation: All physical field data are mapped to the symmetric log domain using

$$\mathrm{symlog}(x) = \mathrm{sign}(x) \cdot \log(1+|x|), \qquad \mathrm{symexp}(y) = \mathrm{sign}(y) \cdot (\exp(|y|) - 1)$$

to stabilize training across large dynamic ranges.
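The transform pair is a few lines of NumPy; `log1p`/`expm1` keep the round trip accurate even for very small amplitudes.

```python
import numpy as np

def symlog(x):
    """symlog(x) = sign(x) * log(1 + |x|): compresses large field amplitudes."""
    return np.sign(x) * np.log1p(np.abs(x))

def symexp(y):
    """Exact inverse of symlog, mapping predictions back to the physical domain."""
    return np.sign(y) * np.expm1(np.abs(y))
```

Because the map is monotone and invertible, losses computed in the symlog domain remain well posed while amplitudes spanning many orders of magnitude are brought onto a comparable scale.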

  • Geometry-Aware Tokenization and Transformer-derived Planning Context: At timestep $t$,
    • Observed feature: $e_t = f_{obs}(\mathrm{symlog}(E_t)) \in \mathbb{R}^D$
    • Action embedding: $u_t = W_a a_t \in \mathbb{R}^D$
    • Structure embedding: $s = f_{str}(\mathrm{STR}) \in \mathbb{R}^D$
    • These form observation/action tokens $O_t = e_t + W_{es} s$ and $A_t = u_t + W_{as} s$, interleaved into a causal sequence and encoded by a masked Transformer. Plan contexts $c_t = h_{2t-1}$ are extracted from the hidden states of the action tokens.
  • GRU Dynamics Core with Structural Drive: The GRU aggregates plan context, the previous latent, optional feedback (pixels or predictions), and a structural drive:

$$r_t = c_t + W_z z_{t-1} + W_e e_{t-1} + W_s s, \qquad h_t = \mathrm{GRU}(h_{t-1}, r_t)$$

Layer normalization is applied to $h_t$. Feedback alternates between ground-truth observations during the prefix ($t \leq P$) and model predictions afterward.

  • Stochastic Latent for Uncertainty Modeling:

$z_t$ is sampled from $q(z_t \mid h_t, E_t)$ during the prefix, and from the prior $p(z_t \mid h_t)$ during rollout:

$$q(z_t \mid \cdot) = \mathcal{N}\big(\mu_q([h_t; \mathrm{vec}(E_t)]), \sigma_q^2(h_t)\big), \qquad p(z_t \mid h_t) = \mathcal{N}\big(\mu_p(h_t), \sigma_p^2(h_t)\big)$$

  • Structure-conditioned Decoder: Outputs the symlog-domain field prediction $\hat{y}_t = \mathrm{Decoder}([h_t; z_t], \mathrm{STR})$, mapped back to the physical domain with $\mathrm{symexp}$.
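The dynamics-core recurrence can be sketched end to end in a few lines. This is a toy NumPy version under stated assumptions: the hidden width `D`, all weight matrices, and the standard GRU-cell gating are illustrative stand-ins, not the paper's released implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # hidden width; an illustrative choice, not from the paper

# Hypothetical weights for the structural drive and a standard GRU cell
Wz, We, Ws = (rng.normal(0.0, 0.1, (D, D)) for _ in range(3))
Wug, Wrg = (rng.normal(0.0, 0.1, (D, 2 * D)) for _ in range(2))
Wnx, Wnh = (rng.normal(0.0, 0.1, (D, D)) for _ in range(2))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dynamics_step(h_prev, c_t, z_prev, e_prev, s):
    """One step of the dynamics core: structural drive, GRU update, LayerNorm.

    Mirrors r_t = c_t + W_z z_{t-1} + W_e e_{t-1} + W_s s and
    h_t = GRU(h_{t-1}, r_t) from the text, with a textbook GRU cell.
    """
    r_t = c_t + Wz @ z_prev + We @ e_prev + Ws @ s       # structural drive
    x = np.concatenate([r_t, h_prev])
    u = sigmoid(Wug @ x)                                 # update gate
    r = sigmoid(Wrg @ x)                                 # reset gate
    n = np.tanh(Wnx @ r_t + Wnh @ (r * h_prev))          # candidate state
    h_t = (1.0 - u) * n + u * h_prev
    return (h_t - h_t.mean()) / (h_t.std() + 1e-5)       # layer norm on h_t
```

During the prefix, `e_prev` comes from ground-truth observations; after step $P$ it would come from the model's own decoded prediction, closing the loop.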

Training Objectives

Training minimizes a weighted sum of reconstruction, open-loop prediction, FFT spectral auxiliary, and KL divergence losses:

  • Prefix reconstruction: $L_{rec} = \sum_{t=1}^{P} \|\hat{y}_t - \mathrm{symlog}(E_t)\|^2$
  • Open-loop prediction: $L_{pred} = \sum_{t=P+1}^{P+Q} \|\hat{y}_t - \mathrm{symlog}(E_t)\|^2$
  • FFT spectral auxiliary: $L_{spec} = \sum_{t=P+1}^{P+Q} \|\mathrm{FFT2}(\hat{y}_t) - \mathrm{FFT2}(\mathrm{symlog}(E_t))\|^2$
  • Dreamer-style KL: $L_{KL} = \sum_{t=1}^{P} \left[\mathrm{KL}(q_t \| p_t)\right]_{\varepsilon}$
  • Total loss: $L = \alpha_{rec} L_{rec} + \alpha_{pred} L_{pred} + \alpha_{spec} L_{spec} + \alpha_{KL} L_{KL}$
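The weighted sum above is mechanical to assemble. A minimal sketch, assuming symlog-domain targets are already stacked into arrays and treating the alpha weights and the KL term as placeholders:

```python
import numpy as np

def total_loss(y_hat, y, P, alphas=(1.0, 1.0, 0.1, 0.1), kl_terms=None):
    """Weighted training loss over symlog-domain targets.

    y_hat, y : arrays of shape (P+Q, H, W), both already in the symlog domain.
    The alpha weights are placeholders; kl_terms stands in for the per-step
    clipped KL values, which require the model's posterior/prior.
    """
    a_rec, a_pred, a_spec, a_kl = alphas
    L_rec = np.sum((y_hat[:P] - y[:P]) ** 2)            # prefix reconstruction
    L_pred = np.sum((y_hat[P:] - y[P:]) ** 2)           # open-loop prediction
    # 2-D FFT over the spatial axes for the spectral auxiliary term
    L_spec = np.sum(np.abs(np.fft.fft2(y_hat[P:]) - np.fft.fft2(y[P:])) ** 2)
    L_kl = float(np.sum(kl_terms)) if kl_terms is not None else 0.0
    return a_rec * L_rec + a_pred * L_pred + a_spec * L_spec + a_kl * L_kl
```

By Parseval's theorem the spectral term is redundant for a perfect fit, but during training it re-weights errors toward getting spatial frequency content right, not just pointwise amplitudes.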

3. Benchmarking Protocol and Evaluation Metrics

FieldSeer I is benchmarked on a dataset of 200 unique FDTD simulations under strict structure-wise splits (180 train, 10 validation, 10 test). Both 64×64 and 80×140 grid resolutions are evaluated. Key evaluation scenarios are:

  • Software-in-the-loop (SITL) Filtering: (64×64, P=80, Q=80)
  • Offline Single-File Rollouts: (80×140, P=240, Q=40)
  • Offline Multi-Structure Rollouts: (80×140, P=180, Q=100)

Metrics for suffix fidelity are computed after applying symexp to model predictions:

  • One-step-ahead MSE and PSNR for SITL:

$$\mathrm{MSE}_{SITL} = \frac{1}{QHW} \sum_{t=P}^{P+Q-1} \|x_{t+1} - \hat{y}_{t+1}\|^2$$

$$\mathrm{PSNR}_{SITL} = 10 \log_{10}\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}_{SITL}}\right)$$

where $\mathrm{MAX}$ is the 99.9th percentile of $|E|$ in the validation set.
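These two metrics translate directly into code. A small sketch, assuming the suffix frames have already been mapped back to the physical domain via symexp and stacked into arrays:

```python
import numpy as np

def sitl_metrics(x_true, y_pred, max_val):
    """One-step-ahead suffix MSE and PSNR.

    x_true, y_pred : (Q, H, W) physical-domain suffix frames (post-symexp).
    max_val        : the 99.9th-percentile |E| from the validation set,
                     as defined in the text.
    """
    mse = np.mean((x_true - y_pred) ** 2)       # averages over Q, H, and W
    psnr = 10.0 * np.log10(max_val ** 2 / mse)
    return mse, psnr
```

Using a high percentile of $|E|$ rather than the absolute maximum makes the PSNR reference robust to isolated amplitude spikes near the source.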

Baselines include a GRU-only variant (no geometry conditioning or symlog, z-score normalization) and a deterministic Transformer-only prototype.

4. Performance Results and Editable Rollouts

FieldSeer I consistently outperforms both baselines in all settings:

| Scenario | FieldSeer I PSNR (dB) | GRU PSNR (dB) | Prototype PSNR (dB) |
|---|---|---|---|
| 64×64 SITL (P=80, Q=80) | 48.31 | 35.63 | 19.32 |
| 80×140 offline single-file (P=240, Q=40) | 39.08 | 24.01 | n/a |
| 80×140 offline multi-structure (P=180, Q=100) | 36.61 | 20.88 | n/a |

Corresponding MSE values, where provided, show similar trends: for 64×64 SITL, FieldSeer I achieves an MSE of $7.6 \times 10^{-4}$, versus $1.406 \times 10^{-2}$ (GRU) and $6.01 \times 10^{-1}$ (Prototype). Suffix-fidelity gains approach 30 dB over the prototype in the one-step-ahead case (a roughly 13 dB gain over GRU) and exceed 15 dB over GRU in the open-loop settings.

The architecture’s continuous injection of the structure map ($\mathrm{STR}$) at all processing stages allows geometry modifications after prefix assimilation, without re-running the assimilation step. Model rollouts immediately reflect new inclusions in the predicted field evolution.
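Because $\mathrm{STR}$ is re-injected at every step, a geometry edit is just a swap of the conditioning input inside the rollout loop. The sketch below illustrates this; `ToyModel` and its `step` method are an assumed interface standing in for the real world model, not the released API.

```python
class ToyModel:
    """Stand-in for the world model; `step(state, a_t, STR)` is an
    assumed interface, used only to make the rollout loop concrete."""
    def step(self, state, a_t, STR):
        # Every step sees the *current* structure map, as in the text.
        return state + 1, a_t * sum(STR)

def rollout(model, state, actions, STR, edit_at=None, STR_new=None):
    """Closed-loop rollout in which the geometry can be edited mid-rollout,
    without re-assimilating the prefix (the editable-rollout property)."""
    frames = []
    for t, a_t in enumerate(actions):
        if edit_at is not None and t == edit_at:
            STR = STR_new               # geometry edit takes effect immediately
        state, y_t = model.step(state, a_t, STR)  # STR injected every step
        frames.append(y_t)
    return frames
```

Note that `state` (the assimilated latent) is never recomputed; only the conditioning changes, which is what makes the edit effectively free at inference time.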

5. Geometry Conditioning and Interactive Digital Twins

FieldSeer I applies persistent geometry conditioning at multiple processing stages: tokenization, the GRU drive, and decoding. This architecture enables real-time, user-driven modifications to the inclusion geometry during inference. If the structure is altered after the initial prefix has been observed, all subsequent rollouts reflect the change without restarting assimilation or retraining.

This capability underpins interactive, simulation-free digital twins for photonic devices. Engineers may observe electromagnetic field evolution for a short duration, edit device geometry, adjust source amplitudes, and visualize the resulting field changes downstream within milliseconds, eliminating the need for expensive forward FDTD simulations.

6. Implications and Future Directions

FieldSeer I transforms the photonic design workflow by enabling long-horizon prediction, prefix assimilation under partial observability, action conditioning, and immediate geometry editing. This supports dynamic exploration of electromagnetic response in photonic structures, supplanting the traditional simulate-analyze cycle with truly interactive design and digital twin capabilities. Anticipated extensions to full 3-D vector fields, multi-physics couplings, and hardware-in-the-loop adaptation are expected to further integrate learning-based world models into practical, high-fidelity photonic engineering (Guo et al., 5 Dec 2025).

A plausible implication is broad applicability to complex time-dependent PDE systems in which partial observation, intervention, and rapid feedback are central, far beyond photonics.
