- The paper introduces a multiscale inference scheme that leverages exponential context scaling to reduce error accumulation in long-range predictions.
- It outperforms autoregressive baselines by sharply lowering Wasserstein distances and improving stability in solar magnetogram predictions.
- The approach extends probabilistic forecasting to complex systems with long memory, with potential applications across various dynamical environments.
Multiscale Inference for Probabilistic Prediction in Partially Observable Dynamical Systems
Overview and Motivation
This paper addresses a critical limitation in conditional diffusion modeling for dynamical systems: prediction under partial observability and long-memory temporal dependencies. Conventional autoregressive schemes, while effective for fully observable or well-assimilated systems, fail to leverage information from distant past states efficiently, which leads to instability and error accumulation in long-range rollouts. The authors introduce a multiscale inference scheme for conditional diffusion models, explicitly designed for physical processes governed by hidden, long-range dynamics. The primary application, and a challenging benchmark, is solar physics, where observations capture only surface and atmospheric dynamics while key drivers of evolution reside in the unobserved solar interior.
Standard autoregressive conditional diffusion rollouts predict future frames by sequentially sliding a fixed-length context window over time, commonly conditioning on the most recent few steps. As shown, this approach ignores distal past information once the initial iterations are complete, promoting error propagation and distribution bias, especially across long horizons. The proposed multiscale inference scheme introduces "multiscale templates" that symmetrically span fine-grained time steps near the present and exponentially coarser intervals farther away, both in the past and the future. By conditioning predictions on a sparse set of steps sampled at multiple scales, the scheme exploits the long-memory characteristics typical of physical dynamical systems.
Figure 1: Multiscale templates and inference scheme—contrasting standard autoregressive rollouts with the proposed multiscale approach, which captures longer-range dependencies while maintaining computational efficiency.
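The contrast between the two context selections can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the powers-of-two spacing is an assumption (the summary only says the intervals grow exponentially), and only the past side of the template is shown.

```python
def autoregressive_offsets(k: int) -> list[int]:
    """Past offsets for a sliding window: the k most recent steps."""
    return list(range(1, k + 1))


def multiscale_offsets(k: int, base: int = 2) -> list[int]:
    """Past offsets spaced exponentially: 1, base, base**2, ... (assumed spacing)."""
    if k < 1 or base < 2:
        raise ValueError("need k >= 1 and base >= 2")
    return [base**i for i in range(k)]


def context_indices(t: int, offsets: list[int]) -> list[int]:
    """Frame indices to condition on at time t, skipping any before the start."""
    return [t - o for o in offsets if t - o >= 0]


if __name__ == "__main__":
    k = 5
    print(autoregressive_offsets(k))                    # [1, 2, 3, 4, 5]
    print(multiscale_offsets(k))                        # [1, 2, 4, 8, 16]
    print(context_indices(10, multiscale_offsets(k)))   # [9, 8, 6, 2]
```

With the same number of conditioning frames, the sliding window reaches 5 steps back while the assumed multiscale spacing reaches 16.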
The templates are constructed such that, for K conditioning frames, the horizon covered grows exponentially with K instead of linearly, allowing the model to efficiently utilize information from the distant past. At each inference step, a fixed-length conditional diffusion model predicts several future frames while conditioning on previously observed or generated frames selected by the multiscale schedule.
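One way to picture an inference step is the loop below. It is a simplified sketch under stated assumptions: `rollout` and `mean_predictor` are hypothetical names, frames are plain floats rather than images, the stand-in predictor replaces the conditional diffusion model, and each step produces a single frame whereas the paper's model predicts several.

```python
from typing import Callable, Sequence


def rollout(
    history: Sequence[float],
    offsets: Sequence[int],
    predict: Callable[[list[float]], float],
    n_steps: int,
) -> list[float]:
    """Generate n_steps frames, conditioning each on frames chosen by `offsets`."""
    frames = list(history)
    for _ in range(n_steps):
        t = len(frames)  # index of the frame about to be generated
        # Context mixes observed and previously generated frames.
        context = [frames[t - o] for o in offsets if t - o >= 0]
        frames.append(predict(context))
    return frames[len(history):]


def mean_predictor(context: list[float]) -> float:
    """Stand-in for the diffusion model (assumption): average of the context."""
    return sum(context) / len(context)


if __name__ == "__main__":
    observed = [float(i) for i in range(16)]
    print(rollout(observed, [1, 2, 4, 8], mean_predictor, n_steps=3))
```

Passing `[1, 2, 3, 4]` as `offsets` gives the autoregressive sliding window, so both schemes can share the same loop and differ only in the schedule.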
Empirical Results: Synthetic Systems and Solar Dynamics
Synthetic Time Series
Initial experiments on synthetic time series, designed with a partially observable sinusoidal trend corrupted by Gaussian noise, highlight the limitations of autoregressive inference. Locally, the underlying trajectory is ambiguous because it is obscured by stochastic realizations, and the model cannot resolve this ambiguity from local context alone, leading to broad, biased predictions at distant future steps. The multiscale inference scheme, which conditions on temporally distributed context, reduces this error by roughly an order of magnitude in the Wasserstein distance between predicted and true distributions ($0.021$ for multiscale vs. $0.23$ for autoregressive).
Figure 2: Multiscale inference yields sharper, less biased probabilistic forecasts than autoregressive schemes on a noisy, partially observable synthetic time series.
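For reference, a Wasserstein distance between two sets of scalar samples can be computed as below. The summary does not say which variant the paper reports, so this sketch assumes the 1-Wasserstein distance between equal-size one-dimensional empirical samples; `wasserstein_1d` is a hypothetical helper name.

```python
import random


def wasserstein_1d(xs: list[float], ys: list[float]) -> float:
    """1-Wasserstein distance between two equal-size 1D empirical samples.

    For equal-size samples the optimal coupling pairs sorted values,
    so the distance is the mean absolute difference of order statistics.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


if __name__ == "__main__":
    rng = random.Random(0)
    truth = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    sharp = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # well-calibrated forecast
    biased = [rng.gauss(0.5, 2.0) for _ in range(2000)]  # broad, shifted forecast
    print(wasserstein_1d(truth, sharp), wasserstein_1d(truth, biased))
```

A forecast that is both shifted and too broad, as described for the autoregressive rollout, scores a larger distance than a well-calibrated one.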
Solar Dynamics Prediction
The central benchmark is a newly curated multi-modal dataset comprising high-resolution (512×512) solar region videos (surface and coronal fields), sampled hourly over 48-hour windows (totaling $8.5$ TB). Solar activity exhibits pronounced long memory: autocorrelation analyses show a smooth, slow decay over time across all measured channels, matching theoretical expectations for sub-surface driven processes.
Figure 3: Solar activity cycles and dataset segmentation, providing both train/test splits and coverage of high/low activity intervals for robust evaluation.
Figure 4: Autocorrelation profiles for solar region channels reveal persistent, long-range dependencies—the motivating signal characteristic for multiscale inference.
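The kind of autocorrelation profile referred to here can be estimated as follows. This is a generic sample estimator, not the paper's analysis code: it assumes a single scalar series (for example, one summary value per frame) and normalizes by the lag-0 sum, which is the usual biased estimator.

```python
def autocorrelation(x: list[float], max_lag: int) -> list[float]:
    """Sample autocorrelation of x for lags 0..max_lag (biased estimator)."""
    n = len(x)
    if n < 2 or not 0 <= max_lag < n:
        raise ValueError("need len(x) >= 2 and 0 <= max_lag < len(x)")
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(dev[i] * dev[i + lag] for i in range(n - lag)) / denom
        for lag in range(max_lag + 1)
    ]


if __name__ == "__main__":
    slow = [float(i) for i in range(48)]  # slowly varying trend over 48 steps
    print([round(r, 3) for r in autocorrelation(slow, 4)])
```

A series with long memory keeps values close to 1 at small lags and decays slowly, whereas a rapidly alternating series flips sign from one lag to the next.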
Extensive quantitative evaluations compare the proposed scheme against autoregressive and hierarchy-based baselines (including FDM of Harvey et al., AViT, and AR-diffusion architectures). Multiscale inference reduces rollout bias and predictive instability, achieving lower Wasserstein distances and lower mean absolute errors in power-spectrum alignment over all tested horizons (1-4, 4-16, 16-32 hours). The method also better preserves solar-physics summary statistics (e.g., unsigned flux, mean horizontal gradient of total/vertical field), indicating greater physical plausibility in generated rollouts.

Figure 5: Visual comparison of predicted solar magnetogram rollouts for multiple inference schemes—multiscale predictions match observed trajectories with improved stability over long horizons.
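As an illustration of how a summary statistic such as unsigned flux can be compared between predicted and observed magnetograms, consider the sketch below. It assumes the common definition (sum of the absolute vertical field over pixels, times pixel area); the paper's exact definitions and units are not given in this summary, and the function names and toy arrays are hypothetical.

```python
def unsigned_flux(bz: list[list[float]], pixel_area: float = 1.0) -> float:
    """Total unsigned flux: pixel_area * sum of |Bz| over all pixels (assumed definition)."""
    return pixel_area * sum(abs(v) for row in bz for v in row)


def relative_error(predicted: float, observed: float) -> float:
    """Relative deviation of a predicted statistic from the observed one."""
    if observed == 0:
        raise ValueError("observed value must be non-zero")
    return abs(predicted - observed) / abs(observed)


if __name__ == "__main__":
    observed = [[1.0, -2.0], [3.0, -4.0]]    # toy 2x2 "magnetogram"
    predicted = [[1.0, -1.5], [2.5, -4.0]]
    err = relative_error(unsigned_flux(predicted), unsigned_flux(observed))
    print(f"relative unsigned-flux error: {err:.2f}")
```

Tracking such an error across rollout steps gives a simple check on whether generated frames stay physically plausible as the horizon grows.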
Theoretical and Practical Implications
The multiscale approach leverages foundational insights from long-memory process theory (e.g., wavelet-based analysis and multifractals), extending the utility of conditional diffusion models into domains where the Markov assumption does not hold for observed states. The exponential scaling in context horizon, for fixed computational cost, allows models to incorporate physically relevant historical context that would be out of reach for standard autoregressive protocols.
Practically, this unlocks more reliable probabilistic forecasting in "hidden Markov" environments, where underlying state variables (e.g., solar interior flows, deep ocean layers) are unmeasured. The framework is shown to generalize beyond solar dynamics, demonstrating improved prediction in partially observable synthetic fluid systems as well.

Figure 6: Scaling of prediction horizon with inference scheme type. Multiscale templates extend the horizon over which long-term rollouts remain stable.
Speculation and Future Directions
- Adaptive conditioning: While the fixed multiscale template is tuned to long-memory, slow decorrelation processes, environments characterized by short-term fluctuations may benefit from schemes that adapt context selection based on estimated autocorrelation structure or learned attention over time. Models combining dynamic template selection or attention mechanisms atop the multiscale backbone merit investigation.
- Extension across domains: The framework is directly applicable to any forecasting domain exhibiting partial observability and slowly decaying temporal correlations, such as climate, geomagnetic, or even financial systems.
- Improved architectures: Synergies between multiscale inference schemes and advanced denoising architectures (e.g., multi-scale Next-DiT, latent-space transformers, or KV-caching models) remain an underexplored frontier for further performance gains.
Conclusion
The multiscale inference scheme for diffusion models, as presented, reduces the instability and error growth that affect probabilistic forecasting of partially observable dynamical systems with long memory. The approach achieves quantifiable improvements in stability, distributional fidelity, and physical plausibility of forecasts on demanding benchmarks, including high-resolution solar activity. By integrating temporal context from both the proximal and the distant past, the framework extends the reach and reliability of conditional generative models into realistic scenarios encountered across the physical sciences and engineering.