Diffusion-Based Financial Time Series Generation

Updated 4 July 2026

Diffusion-based generative frameworks for financial time series are probabilistic models that synthesize and forecast data while preserving stylized facts such as fat tails and volatility clustering.
They incorporate finance-specific noising processes, including GBM-based and non-Gaussian methods, to capture multiplicative price dynamics and temporal dependencies.
Hybrid architectures and diverse conditioning mechanisms enable applications in stress testing, risk-neutral derivative pricing, and market simulation with robust empirical validation.

Diffusion-based generative frameworks for financial time series are a class of probabilistic models that learn to synthesize, forecast, complete, denoise, or simulate financial sequences by reversing a prescribed noising or bridge process. In finance, these frameworks are used not only for unconditional sample generation, but also for scenario analysis, stress testing, controllable market simulation, data augmentation, denoising, and derivative pricing. Across recent work, the central design question is not merely how to generate sequences, but how to preserve financial structure: stylized facts such as fat tails, volatility clustering, leverage effects, intraday seasonality, cross-series dependence, stochastic volatility, or risk-neutral martingale constraints (Takahashi et al., 2024, Kim et al., 25 Jul 2025, Tiwari, 21 Mar 2026).

1. Problem setting and domain-specific objectives

In the financial domain, diffusion-based generation is motivated by the observation that realistic synthetic data must reproduce both statistical regularities and temporal dynamics. One line of work states that generating realistic synthetic financial time series is challenging because no model yet satisfies all the stylized facts, including fat tails, volatility clustering, and seasonality patterns (Takahashi et al., 2024). Another line emphasizes that financial time series are difficult because they are non-stationary, heavy-tailed, noisy, and temporally dependent, and that they exhibit stylized facts such as volatility clustering, skewness, and excess kurtosis (Waller et al., 25 Jun 2026).

The targets of these frameworks vary substantially. Some systems are designed for synthetic data generation from prices, volumes, spreads, or macro factors (Takahashi et al., 2024, Guo et al., 4 Sep 2025). Others are explicitly conditional: CoFinDiff conditions on trend and realized volatility to generate trajectories aligned with user-specified market regimes (Tanaka et al., 6 Mar 2025), while the chart-editing framework generates a future candlestick chart from a current chart plus an instruction prompt containing RSI and MACD (Lee et al., 2 Sep 2025). Other work is centered on denoising rather than raw generation, using conditional diffusion to reconstruct cleaner financial series for downstream prediction and trading (Wang et al., 2024). There are also task-specific generative formulations for stress testing (Guo et al., 4 Sep 2025), risk-neutral derivative pricing (Tiwari, 21 Mar 2026), and augmentation for downstream forecasting or hedging (Tanaka et al., 6 Mar 2025, Alouadi et al., 8 Apr 2026).

A recurring theme is that financial usefulness is broader than point prediction. The chart-generation framework explicitly reframes forecasting as image generation rather than predicting a scalar return or classifying a pattern (Lee et al., 2 Sep 2025). The compressed-sensing diffusion framework for macro-financial factors is centered on stress testing and standard scenario analysis rather than exact path reproduction (Guo et al., 4 Sep 2025). The risk-neutral DDPM treats derivative valuation as a measure-adjusted generative sampling problem (Tiwari, 21 Mar 2026). This suggests that diffusion-based finance is best understood as a family of distribution-learning methods whose outputs can be future scenarios, latent factors, denoised signals, or arbitrage-consistent price paths, rather than a single forecasting architecture.

2. Stochastic foundations and finance-specific noising processes

The canonical formulation follows DDPM or score-based diffusion. In the standard DDPM setup, the forward process gradually corrupts a clean sample $x_0$ through Gaussian transitions

$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t; \sqrt{1-\beta_t}\,x_{t-1}, \beta_t I\right),$

with reverse denoising learned by a neural network that predicts either noise or the clean sample (Takahashi et al., 2024, Yuan et al., 2024). In score-based form, the reverse-time process uses the score $\nabla \log p_t(x)$ to move a noisy sample toward high-density regions (Wang et al., 2024, Guo et al., 4 Sep 2025). Diffusion-TS departs from standard noise prediction by directly reconstructing $x_0$ at each step and augmenting the objective with a Fourier-domain term (Yuan et al., 2024).
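As an illustration of the forward process above, the Gaussian transitions compose in closed form: with $\alpha_t = 1-\beta_t$ and $\bar\alpha_t = \prod_{s \le t} \alpha_s$, one has $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$. The NumPy sketch below samples $x_t$ directly from $x_0$. The linear $\beta$ schedule, the number of steps, the synthetic input, and the function names are illustrative assumptions and are not taken from any of the cited papers.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Illustrative linear variance schedule (an assumption, not from the cited papers)."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_sample(x0, t, betas, rng):
    """Draw x_t ~ q(x_t | x_0) using the closed form of the composed Gaussian steps.

    t is 1-indexed, so t=1 applies a single noising step.
    Returns the noisy sample and the noise that was used.
    """
    alpha_bar = np.cumprod(1.0 - betas)      # cumulative product of alpha_s = 1 - beta_s
    a = alpha_bar[t - 1]
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise
    return x_t, noise

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
x0 = 0.01 * rng.standard_normal(64)          # synthetic stand-in for a window of log returns
x_t, eps = forward_sample(x0, t=500, betas=betas, rng=rng)
```

A noise-prediction network would be trained to recover `eps` from `x_t` and `t`, whereas an $x_0$-prediction model such as the one described for Diffusion-TS would target `x0` instead.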

A major finance-specific development is the replacement of generic additive Gaussian corruption with multiplicative or state-dependent processes. The GBM-based framework introduces geometric Brownian motion directly into the forward noising process, $\mathrm{d}s_\tau = \tilde \mu_\tau s_\tau \,\mathrm{d}\tau + \tilde \sigma_\tau s_\tau \,\mathrm{d}\tilde w_\tau$, so that perturbation magnitude scales with price level (Kim et al., 25 Jul 2025). After transforming to log-price space and choosing $\mu_t = \tfrac12 \sigma_t^2$, the process becomes a variance-exploding SDE,

$\mathrm{d}X_t = \sigma_t \,\mathrm{d}W_t,$

which preserves compatibility with standard score-based reverse modeling while embedding a financial inductive bias about heteroskedasticity (Kim et al., 25 Jul 2025). The more general non-Gaussian framework based on Tweedie’s formula extends this logic to GBM, BESQ, and CIR processes, arguing that additive Gaussian perturbations are structurally restrictive for finance because many meaningful quantities are positive and evolve multiplicatively (Tang et al., 19 May 2026).
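The reduction of the GBM forward process to a variance-exploding SDE follows from Itô's lemma. The short derivation below is a standard calculation, written with $X_t = \log s_t$ and with the tildes dropped for readability; it is offered as a worked step and is not a reproduction of the cited paper's own derivation.

$\mathrm{d}\log s_t = \frac{1}{s_t}\,\mathrm{d}s_t - \frac{1}{2 s_t^2}\,\mathrm{d}\langle s \rangle_t = \left(\mu_t - \tfrac12 \sigma_t^2\right)\mathrm{d}t + \sigma_t\,\mathrm{d}W_t,$

since $\mathrm{d}\langle s \rangle_t = \sigma_t^2 s_t^2\,\mathrm{d}t$. Setting $\mu_t = \tfrac12 \sigma_t^2$ cancels the drift and leaves $\mathrm{d}X_t = \sigma_t\,\mathrm{d}W_t$. In price space the same perturbation is multiplicative,

$s_t = s_0 \exp\left(\int_0^t \sigma_u\,\mathrm{d}W_u\right),$

so the absolute size of the noise grows with the price level while the log-price noise remains additive and Gaussian.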

A second branch replaces the standard Gaussian endpoint prior rather than the forward diffusion coefficient. TimeBridge uses diffusion bridges and the Schrödinger bridge viewpoint to learn paths between a chosen prior and the data distribution, allowing priors that preserve scale, temporal order, and time-series structure (Park et al., 2024). Its conditional setting treats the condition itself as the prior endpoint, which is especially relevant when one wishes to enforce trend-guided synthesis or imputation constraints without extra guidance penalties (Park et al., 2024).

A third branch generalizes diffusion beyond static-path denoising. AD-Seq argues that standard diffusion on the whole trajectory can violate the information flow of time and proposes a sequential forward-backward diffusion that generates coordinates one at a time, conditioning only on previously generated history (Cao et al., 4 Jun 2026). For financial applications, that distinction is substantive: it targets adapted, non-anticipative sampling rather than a joint static score over the entire path (Cao et al., 4 Jun 2026).

3. Representations, latent spaces, and conditioning interfaces

Financial diffusion models differ sharply in how they represent the data. One representation family converts time series into images so that vision backbones can be reused. The wavelet-DDPM framework transforms synchronized intraday price log returns, bid/ask spreads, and trading volumes into a $16 \times 256 \times 3$ image by applying mirror expansion, Haar wavelets, and RGB channel stacking; generation then occurs in image space and is inverted back to time series (Takahashi et al., 2024). The chart-generation framework turns a 4-hour Bitcoin futures history into a candlestick chart image containing candlesticks, trading volume, SMA5, and SMA90, and fine-tunes Stable Diffusion 1.5 to generate the chart at timestep $n+3$ from the chart at timestep $n$ and a prompt of the form “Predict next candle, RSI is {value}, MACD is {value}” (Lee et al., 2 Sep 2025). For irregular data, a separate two-stage framework first completes the irregular series with a Time Series Transformer, then maps the completed regular series into image space with delay embedding before applying a vision diffusion backbone with masking (Fadlon et al., 8 Oct 2025).
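To make the wavelet step of the image representation concrete, the following sketch implements a multilevel orthonormal Haar transform and its inverse for a single return series. It is a minimal illustration under stated assumptions: it does not reproduce the mirror expansion, the $16 \times 256 \times 3$ layout, or the RGB channel stacking of the cited framework, and the function names, the number of levels, and the synthetic input are hypothetical.

```python
import numpy as np

def haar_decompose(x, levels):
    """Multilevel orthonormal Haar transform of a 1-D array.

    The length of x must be divisible by 2**levels.
    Returns the coarsest approximation and a list of detail arrays, finest first.
    """
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # high-pass (detail) coefficients
        approx = (even + odd) / np.sqrt(2.0)         # low-pass (approximation) coefficients
    return approx, details

def haar_reconstruct(approx, details):
    """Invert haar_decompose, working from the coarsest level back to the finest."""
    x = np.asarray(approx, dtype=float)
    for d in reversed(details):
        out = np.empty(2 * x.size)
        out[0::2] = (x + d) / np.sqrt(2.0)
        out[1::2] = (x - d) / np.sqrt(2.0)
        x = out
    return x

rng = np.random.default_rng(0)
returns = 1e-3 * rng.standard_normal(256)   # synthetic stand-in for 256 intraday log returns
approx, details = haar_decompose(returns, levels=4)
recovered = haar_reconstruct(approx, details)
```

Because the transform is invertible, a model can generate coefficients in the wavelet domain and map them back to a time series without loss, which is the property the image-based pipelines rely on.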

A second family stays in the sequence domain but redesigns the denoiser. Diffusion-TS uses an encoder-decoder transformer with explicit decomposition into trend, seasonal components, and residuals, together with a Fourier loss (Yuan et al., 2024). The GBM-based financial generator adapts the CSDI transformer backbone, adding explicit positional encodings and larger embeddings to better capture stylized facts such as the leverage effect (Kim et al., 25 Jul 2025). TIMED combines a DDPM, a masked-attention supervisor network for autoregressive refinement, a Wasserstein critic, and an MMD loss, explicitly arguing that global distributional realism and local temporal coherence must be learned jointly (EskandariNasab et al., 23 Sep 2025). QDiffusion-TS replaces feed-forward sublayers inside the Diffusion-TS denoising transformer with quantum neural networks, producing a hybrid quantum transformer used inside the reverse diffusion process (Waller et al., 25 Jun 2026).

A third family emphasizes latent factor spaces. The compressed-sensing diffusion model trains in a reduced space and reconstructs or interprets the result in the original space (Guo et al., 4 Sep 2025). In the finance application, the exact sketching pipeline is replaced by PCA: 126 FRED-MD macro factors are projected onto the first 6 principal components, which explain over 90% of the variance, and diffusion is trained in that 6-dimensional PC space for stress testing and portfolio scenario generation (Guo et al., 4 Sep 2025). This suggests that when the data exhibit strong low-dimensional structure, latent diffusion may be more natural than full-dimensional path generation.
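The PCA-based latent pipeline described above can be sketched as follows. The data here are synthetic low-rank-plus-noise draws rather than FRED-MD, and the helper names (`pca_fit`, `pca_encode`, `pca_decode`) are hypothetical; only the dimensions (126 factors, 6 components) follow the text. The diffusion model itself is omitted: the sketch shows only the encode and decode steps that would surround it.

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA via SVD. Returns the column means, the top components (as rows),
    and the explained-variance ratio of each retained component."""
    mean = X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    ratio = S**2 / np.sum(S**2)
    return mean, Vt[:n_components], ratio[:n_components]

def pca_encode(X, mean, components):
    """Project observations onto the retained principal components."""
    return (X - mean) @ components.T

def pca_decode(Z, mean, components):
    """Map latent (possibly generated) samples back to the original factor space."""
    return Z @ components + mean

rng = np.random.default_rng(0)
n_obs, n_factors, k = 400, 126, 6
# Synthetic stand-in data with strong 6-dimensional structure (an assumption).
latent = rng.standard_normal((n_obs, k))
loadings = rng.standard_normal((k, n_factors))
X = latent @ loadings + 0.1 * rng.standard_normal((n_obs, n_factors))

mean, components, ratio = pca_fit(X, k)
Z = pca_encode(X, mean, components)       # (400, 6): space in which diffusion would be trained
X_hat = pca_decode(Z, mean, components)   # (400, 126): reconstruction in the original space
```

In a scenario-generation setting, samples drawn from a diffusion model trained on `Z` would be passed through `pca_decode` before portfolio returns are computed.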

Conditioning mechanisms are equally diverse. CoFinDiff uses cross-attention to inject trend and realized volatility into a conditional diffusion model operating on Haar-wavelet return images (Tanaka et al., 6 Mar 2025). The chart-editing model uses cross-attention in the standard latent diffusion sense to condition on RSI and MACD prompts (Lee et al., 2 Sep 2025). DS-Diffusion avoids retraining for new conditions by extracting trend and seasonal “style” components from real data and inserting them only at inference time through style-guided kernels and time-information based hierarchical denoising (Sun et al., 23 Sep 2025). A different conditioning paradigm appears in the GAN-diffusion hybrid, where a trained CoMeTS-GAN critic guides denoising so that diffusion samples are pushed toward regions judged realistic and correlation-consistent by the critic (Masi et al., 26 May 2026).
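As a concrete example of a trend and realized-volatility conditioning signal, the sketch below computes both quantities for one price window. The definitions used here are common conventions adopted as assumptions, not necessarily those of CoFinDiff: trend is taken as the cumulative log return over the window, and realized volatility as the square root of the sum of squared log returns. The function name and the sample prices are hypothetical.

```python
import numpy as np

def condition_vector(prices):
    """Return [trend, realized_vol] for one price window.

    Assumed conventions (not taken from the cited paper):
      trend        = sum of log returns = log(P_end / P_start)
      realized_vol = sqrt(sum of squared log returns)
    """
    log_returns = np.diff(np.log(np.asarray(prices, dtype=float)))
    trend = log_returns.sum()
    realized_vol = np.sqrt(np.sum(log_returns**2))
    return np.array([trend, realized_vol])

prices = np.array([100.0, 101.0, 100.5, 102.0, 101.0])
cond = condition_vector(prices)
```

A conditional denoiser would receive a vector like `cond` alongside the noisy sample, for instance as the key and value input of a cross-attention layer.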

4. Architectural families and application regimes

The simplest architectural pattern is a diffusion model trained directly for synthetic path generation. This includes wavelet-image DDPMs (Takahashi et al., 2024), wavelet-domain multilevel diffusion with dedicated transformers per wavelet scale (Wang et al., 13 Oct 2025), and transformer-based sequence denoisers such as Diffusion-TS (Yuan et al., 2024). WaveletDiff is explicitly multiscale: it diffuses each wavelet level separately, uses dedicated transformers for each level, and introduces cross-level attention with adaptive gating. The paper reports that removing cross-level attention worsens both the discriminative score and Context-FID on average, highlighting the importance of selective inter-scale information exchange (Wang et al., 13 Oct 2025).

A second architectural family couples diffusion with auxiliary mechanisms. TIMED augments DDPM generation with teacher-forced autoregressive supervision, Wasserstein feedback, and MMD alignment (EskandariNasab et al., 23 Sep 2025). CoMeTS-GAN plus diffusion uses a separate adversarial model to guide cross-asset correlation structure at inference time rather than retraining the diffusion model itself (Masi et al., 26 May 2026). QDiffusion-TS modifies the denoiser internally by replacing classical feed-forward blocks with QNNs, reducing the number of trainable parameters in each replaced component from 33,088 to 36 (Waller et al., 25 Jun 2026). These designs indicate that finance-oriented diffusion increasingly relies on hybridization when the objective extends beyond unconditional sample realism.

A third architectural family uses diffusion for conditional or task-specific generation. CoFinDiff is designed to generate synthetic returns aligned with arbitrary trend-volatility settings, including rare and extreme regimes (Tanaka et al., 6 Mar 2025). The Bitcoin chart framework is explicitly a generative forecasting system that outputs a future chart four hours ahead (Lee et al., 2 Sep 2025). The compressed-factor model is a latent scenario generator for macroeconomic stress testing (Guo et al., 4 Sep 2025). RN-DDPM modifies the reverse DDPM sampler by a closed-form risk-neutral shift so that discounted price paths satisfy the martingale condition under the risk-neutral measure $\mathbb{Q}$ (Tiwari, 21 Mar 2026). SBBTS moves further, formulating synthetic financial time-series generation as a Schrödinger–Bass bridge problem that jointly calibrates drift and volatility rather than fixing one of them (Alouadi et al., 8 Apr 2026).
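The martingale condition mentioned for risk-neutral generation can be checked empirically on any set of generated paths. With a constant risk-free rate $r$ and no dividends, it requires that the mean discounted price $e^{-rt} S_t$ equal $S_0$ at every horizon. The sketch below implements this diagnostic and applies it to exact risk-neutral GBM paths as a reference case. It is not the RN-DDPM sampler; the function name, the parameter values, and the assumption that all paths share the same $S_0$ are illustrative.

```python
import numpy as np

def discounted_martingale_error(paths, r, dt):
    """Relative deviation of the mean discounted price from S_0 at each time step.

    paths has shape (n_paths, n_steps + 1); all paths are assumed to start at the same S_0.
    """
    n_steps = paths.shape[1] - 1
    discount = np.exp(-r * dt * np.arange(n_steps + 1))
    mean_discounted = (paths * discount).mean(axis=0)
    return np.abs(mean_discounted / paths[0, 0] - 1.0)

# Reference case: exact GBM simulated under the risk-neutral measure (drift r).
rng = np.random.default_rng(0)
s0, r, sigma = 100.0, 0.03, 0.2
dt, n_steps, n_paths = 1.0 / 52.0, 52, 20_000
z = rng.standard_normal((n_paths, n_steps))
log_increments = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
log_paths = np.concatenate(
    [np.zeros((n_paths, 1)), np.cumsum(log_increments, axis=1)], axis=1
)
paths = s0 * np.exp(log_paths)

err = discounted_martingale_error(paths, r, dt)
```

For paths produced by a generative model, errors well above Monte Carlo noise would indicate that the samples are not consistent with risk-neutral pricing at that horizon.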

There is also an application-specific denoising branch. A financial time-series denoiser based on conditional diffusion uses the original series as condition, adds TV and Fourier losses during inference, and reports improvements in downstream future-return classification and trading performance (Wang et al., 2024). Although this is not a synthetic-data generator in the narrow sense, it belongs to the same diffusion-based generative toolkit because it uses forward corruption and reverse reconstruction to move a raw series toward a cleaner latent signal (Wang et al., 2024).

5. Evaluation methodology and empirical findings

Evaluation remains heterogeneous because the outputs themselves are heterogeneous. In stylized-fact-oriented studies, the criterion is whether generated data reproduce financial regularities. The GBM-based diffusion model reports that heavy-tailed return distributions, volatility clustering, and the leverage effect are reproduced more realistically than by conventional diffusion models, with GBM generally outperforming VE and VP settings on these diagnostics (Kim et al., 25 Jul 2025). The wavelet-image DDPM evaluates fat tails, autocorrelation decay, intraday seasonality, and cross-correlations among log returns, spreads, and volumes, concluding that the wavelet representation is crucial for intraday seasonality and jointly modeling multiple synchronized series (Takahashi et al., 2024). CoFinDiff evaluates Fisher’s kurtosis, Hill index, and autocorrelation, and reports that it captures both fat tails and volatility clustering while simultaneously meeting specified trend and realized-volatility conditions (Tanaka et al., 6 Mar 2025).
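The stylized-fact diagnostics named above (kurtosis, a Hill tail-index estimate, and autocorrelation of absolute returns) can be computed with a few lines of NumPy. The sketch below is a generic illustration rather than any paper's evaluation code: the function names, the choice of $k$ for the Hill estimator, and the two synthetic test series are assumptions.

```python
import numpy as np

def excess_kurtosis(x):
    """Fisher (excess) kurtosis: fourth standardized moment minus 3."""
    z = np.asarray(x, dtype=float) - np.mean(x)
    return np.mean(z**4) / np.mean(z**2) ** 2 - 3.0

def hill_estimator(x, k):
    """Hill estimate of the tail index from the k largest absolute values."""
    a = np.sort(np.abs(np.asarray(x, dtype=float)))[::-1]   # descending order statistics
    return 1.0 / np.mean(np.log(a[:k] / a[k]))

def autocorr(x, lag):
    """Sample autocorrelation at a positive lag."""
    z = np.asarray(x, dtype=float) - np.mean(x)
    return np.dot(z[:-lag], z[lag:]) / np.dot(z, z)

rng = np.random.default_rng(0)
# Heavy-tailed i.i.d. sample: Student-t with 3 degrees of freedom (tail index 3).
heavy = rng.standard_t(df=3, size=50_000)
# Toy series with volatility clustering: alternating calm and turbulent blocks.
vol = np.repeat(np.tile([1.0, 3.0], 50), 500)
clustered = vol * rng.standard_normal(vol.size)

kurt = excess_kurtosis(heavy)                     # positive for fat-tailed data
alpha_hat = hill_estimator(heavy, k=500)          # rough estimate of the tail index
acf_abs = autocorr(np.abs(clustered), lag=1)      # positive under volatility clustering
```

In practice the same statistics would be computed on real and generated returns and compared side by side; the Hill estimate in particular is sensitive to the choice of `k`.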

In benchmark-driven studies, evaluation uses generative metrics such as discriminative score, predictive score, Context-FID, correlation mismatch, or DTW-JS. WaveletDiff reports discriminative scores and Context-FID scores that are lower on average than those of the second-best baseline across six datasets, including Stocks and Exchange Rate (Wang et al., 13 Oct 2025). DS-Diffusion reports average reductions relative to ImagenTime of 5.56% in predictive score and 61.55% in discriminative score, together with reductions in KL divergence, JS divergence, Wasserstein distance, and KS statistic (Sun et al., 23 Sep 2025). The irregular-sequence generation framework reports an average 74.2% relative improvement in discriminative score at length 24 and an average 85% reduction in computational cost relative to KoVAE, while explicitly noting that predictive and discriminative scores do not by themselves guarantee financial realism (Fadlon et al., 8 Oct 2025).

Task-specific financial evaluations often use more operational diagnostics. The chart-generation framework proposes a simple RGB-based evaluation by reading the upper-right mark in the generated chart and comparing it with the ground truth class. On 781 evaluation pairs, it reports overall accuracy of 68.89%, but with strong class imbalance: black F1 of 81.59%, blue F1 of 9.52%, and red F1 of 11.43%, indicating that the model captures the dominant “no strong move” class more effectively than minority directional classes (Lee et al., 2 Sep 2025). The compressed-factor diffusion model evaluates real versus generated 6-month cumulative log-return distributions under equal-weight, GMVP, and risk-parity portfolios, and reports that generated-data standard scenario analysis reproduces real-data stress-testing behavior closely, with GMVP mean return reproduced within less than about 0.2% and risk-parity especially accurate (Guo et al., 4 Sep 2025). The CoMeTS-GAN-guided diffusion experiment reports large improvements in pairwise stock-correlation Wasserstein distances under critic guidance, including for the KO–PEP and NVDA–KSU pairs (Masi et al., 26 May 2026). QDiffusion-TS reports an average Wasserstein-distance improvement of about 44% relative to its classical counterpart across Apple and Amazon, and a downstream forecasting improvement of up to about 71% in RMSE over a baseline trained only on real data (Waller et al., 25 Jun 2026). SBBTS reports higher classification accuracy and Sharpe ratio than real-data-only training when synthetic data are used for augmentation on S&P 500 forecasting (Alouadi et al., 8 Apr 2026).
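One way to compute a correlation-based Wasserstein diagnostic of the kind reported above is sketched below. The construction is an assumption and may differ from the cited experiment: correlations are measured over rolling windows, and the 1-Wasserstein distance between two equal-size samples is computed as the mean absolute difference of their sorted values. All names and the synthetic return pairs are hypothetical.

```python
import numpy as np

def wasserstein_1d(x, y):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.size != y.size:
        raise ValueError("this simple form requires equal sample sizes")
    return np.mean(np.abs(x - y))

def rolling_corr(a, b, window):
    """Pearson correlation of a and b over each full rolling window."""
    out = np.empty(a.size - window + 1)
    for i in range(out.size):
        out[i] = np.corrcoef(a[i:i + window], b[i:i + window])[0, 1]
    return out

def make_pair(rho, n, rng):
    """Two synthetic return series with population correlation rho."""
    z1, z2 = rng.standard_normal((2, n))
    return z1, rho * z1 + np.sqrt(1.0 - rho**2) * z2

rng = np.random.default_rng(0)
n, window = 2000, 50
real = rolling_corr(*make_pair(0.8, n, rng), window)        # stand-in for a real stock pair
good = rolling_corr(*make_pair(0.8, n, rng), window)        # generator that preserves correlation
bad = rolling_corr(*make_pair(0.0, n, rng), window)         # generator that ignores correlation

dist_good = wasserstein_1d(real, good)
dist_bad = wasserstein_1d(real, bad)
```

A smaller distance indicates that the generated pair reproduces the distribution of rolling correlations seen in the reference pair, which is the sense in which critic guidance is reported to help.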

A persistent methodological issue is that some evaluations are intentionally simple. The chart-mark metric of Lee et al. is explicitly presented as a simple image-marking evaluation method (Lee et al., 2 Sep 2025). The compressed-factor paper reports summary statistics rather than formal distributional metrics such as FID or KS distance for its financial section (Guo et al., 4 Sep 2025). Several papers therefore distinguish synthetic similarity from economic validity rather than treating them as interchangeable.

6. Limitations, controversies, and open directions

The first major limitation is that benchmark realism is not identical to financial utility. The irregular-series image-diffusion paper explicitly states that high synthetic similarity does not imply usefulness for trading or risk models (Fadlon et al., 8 Oct 2025). The chart-generation study likewise notes that RGB mark classification may not reflect true financial usefulness and that generated images may differ from ground truth in subtle ways that the metric ignores (Lee et al., 2 Sep 2025). This suggests that discriminative or FID-like metrics should be interpreted as partial diagnostics rather than sufficient validation criteria in finance.

The second limitation concerns domain mismatch in the forward process or prior. TimeBridge is built on the premise that the fixed standard-Gaussian diffusion prior can be ill-suited for time series because it ignores temporal order, continuity, and scale (Park et al., 2024). The GBM-based financial generator makes the same point from a different angle: additive noise does not naturally encode multiplicative price dynamics or price-level-dependent perturbations (Kim et al., 25 Jul 2025). The non-Gaussian Tweedie framework formalizes this as a broader critique of Gaussian diffusion for positive, heteroskedastic financial variables (Tang et al., 19 May 2026). A plausible implication is that finance-specific noising processes are not an implementation detail but a central modeling choice.

The third limitation is scope. Several financial demonstrations are deliberately narrow. The chart-editing model uses only Bitcoin futures data from Binance and is described as an exploratory and limited first step (Lee et al., 2 Sep 2025). The compressed-factor model is empirically focused on a single macro/stock setup and replaces the exact compressed-sensing pipeline with PCA in finance, so the theoretical CSDM bounds do not transfer directly (Guo et al., 4 Sep 2025). TimeBridge discusses finance relevance but does not evaluate on a dedicated financial benchmark beyond the Stocks dataset (Park et al., 2024). QDiffusion-TS validates on Apple and Amazon and uses shorter sequences for hardware experiments because of quantum constraints (Waller et al., 25 Jun 2026).

A fourth issue is temporal causality. Standard diffusion over whole trajectories may fail to preserve adaptedness, which matters for sequential financial decision-making (Cao et al., 4 Jun 2026). AD-Seq addresses this explicitly by generating one coordinate at a time conditioned on previously generated history (Cao et al., 4 Jun 2026). This suggests a substantive divide between static path generation and non-anticipative financial simulation.

Current research therefore points in several convergent directions. One is richer conditioning: chart-based forecasting proposes extensions to news, sentiment, FOMC announcements, order book data, or multi-timeframe charts (Lee et al., 2 Sep 2025). Another is stronger multiscale structure, as in WaveletDiff and Diffusion-TS (Wang et al., 13 Oct 2025, Yuan et al., 2024). A third is better treatment of cross-asset dependence, seen in critic-guided diffusion and SBBTS (Masi et al., 26 May 2026, Alouadi et al., 8 Apr 2026). A fourth is finance-aware evaluation that combines stylized facts, pathwise dependence, downstream portfolio or hedging performance, and, where relevant, martingale consistency under risk-neutral dynamics (Tiwari, 21 Mar 2026, Tanaka et al., 6 Mar 2025). Across these directions, the field is moving from generic time-series diffusion toward architectures whose stochastic processes, priors, conditioning interfaces, and evaluation protocols are explicitly aligned with financial structure.