
Intraday Interval-Level Evaluation

Updated 8 February 2026
  • Intraday interval-level evaluation is the systematic partitioning of time-series data into fixed intervals to assess high-frequency forecasting, risk, and structural patterns.
  • It enables detailed diagnostics of model performance, revealing diurnal error structures, regime shifts, and microstructure effects across financial and energy markets.
  • Its applications span high-frequency risk assessment, volatility modeling, and electricity price forecasting, offering actionable insights for trading and risk management.

Intraday interval-level evaluation refers to the systematic assessment of models, forecasts, or statistical phenomena at granular, within-day time scales, typically through partitioning financial or energy market time series into fixed intervals (ranging from seconds to hours) and computing performance, coverage, or risk diagnostics per interval. This methodology is foundational across domains such as high-frequency trading, volatility modeling, electricity price forecasting, and risk management, enabling the detection of diurnal error patterns, regime shifts, microstructure effects, and structural breaks that are indiscernible in daily-aggregated analysis.

1. Definitions, Motivations, and Research Domains

Intraday interval-level evaluation decomposes temporal data into regular subperiods (e.g., 1-min, 5-min, 15-min, 30-min, or hourly intervals) within each day. This approach is motivated by both empirical regularities (U-shaped activity, volatility clustering, return predictability at particular times) and practical demands—such as setting risk margins, executing optimal trades, and generating robust probabilistic forecasts.

Key research domains and use cases include:

2. Interval Construction and Notational Frameworks

Intervals are defined by the business, data, or process under study, with the following common regularizations:

  • Fixed-length calendar intervals: e.g., 1 min, 5 min, 15 min, 30 min, 1 hour.
    • Trading day of $T$ min: $N = T/\Delta t$ intervals.
  • Event-based intervals: e.g., $\tau$ consecutive trades, ticks, or mid-quote changes (“event time”), which normalizes for trading activity (0906.3841).
  • Alignment to reference schedules: e.g., 48 half-hour slots in NEM electricity markets (Gani et al., 1 Feb 2026), or overlapping “days” started at each trading hour for margin setting (Cotter et al., 2011).

Formally, let $\{y_{t,i}\}$ denote the quantity of interest (e.g., price, return, forecast error) for day $t$ and interval $i = 1, \ldots, N$. Model evaluation and summary statistics are then computed per interval across the test period.
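The fixed-length calendar construction above can be sketched in a few lines. This is a minimal illustration only: the function names, the 390-minute session, and the 09:30 open are assumptions chosen for the example, not values taken from the cited papers.

```python
from datetime import datetime, time


def n_intervals(session_minutes: int, delta_minutes: int) -> int:
    """Number of intervals N = T / delta_t for a session of T minutes."""
    if session_minutes % delta_minutes != 0:
        raise ValueError("interval length must divide the session length")
    return session_minutes // delta_minutes


def interval_index(ts: datetime, session_open: time, delta_minutes: int) -> int:
    """1-based interval index i of a timestamp inside the session."""
    minutes_since_open = (ts.hour - session_open.hour) * 60 + (
        ts.minute - session_open.minute
    )
    if minutes_since_open < 0:
        raise ValueError("timestamp precedes the session open")
    return minutes_since_open // delta_minutes + 1


# Hypothetical equity session: 390 minutes from 09:30, 30-minute intervals.
N = n_intervals(390, 30)
i = interval_index(datetime(2024, 1, 2, 10, 5), time(9, 30), 30)

# A 24-hour market cut into half-hour slots yields the 48 slots mentioned above.
slots_per_day = n_intervals(24 * 60, 30)
```

Observations can then be stored as `y[t][i - 1]`, one row per day and one column per interval, which is the layout the per-interval statistics assume.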

3. Evaluation Metrics and Diagnostic Procedures

3.1. Error Metrics

Typical univariate metrics, computed and reported at the interval level, include the following, where $T$ here denotes the number of days in the test period:

| Metric | Definition (interval $i$) |
|---|---|
| Mean Absolute Error (MAE) | $\mathrm{MAE}_{i} = \frac{1}{T}\sum_{t=1}^{T} \lvert y_{t,i} - \hat y_{t,i} \rvert$ |
| Root Mean Square Error (RMSE) | $\mathrm{RMSE}_{i} = \sqrt{\frac{1}{T}\sum_{t=1}^{T} (y_{t,i} - \hat y_{t,i})^{2}}$ |
| Mean Absolute Percent Error (MAPE) | $\mathrm{MAPE}_{i} = \frac{100\%}{T}\sum_{t=1}^{T} \left\lvert \frac{y_{t,i} - \hat y_{t,i}}{y_{t,i}} \right\rvert$, $y_{t,i} \ne 0$ |
| Directional Accuracy (DA) | $\mathrm{DA}_{i} = \frac{1}{T-1}\sum_{t=2}^{T} \mathbf{1}\left[ (y_{t,i} - y_{t-1,i})(\hat y_{t,i} - \hat y_{t-1,i}) > 0 \right] \times 100\%$ |
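As a rough illustration of how these four metrics are computed per interval rather than in aggregate, the sketch below loops over the interval index and averages over days. The function name `interval_metrics`, the nested-list layout, and the toy numbers are assumptions made for this example.

```python
import math


def interval_metrics(y, y_hat):
    """Per-interval MAE, RMSE, MAPE and DA.

    y, y_hat: lists of T days, each a list of N interval values.
    Returns a list with one metrics dict per interval.
    """
    T, N = len(y), len(y[0])
    out = []
    for i in range(N):
        err = [y[t][i] - y_hat[t][i] for t in range(T)]
        mae = sum(abs(e) for e in err) / T
        rmse = math.sqrt(sum(e * e for e in err) / T)
        # MAPE is only defined when every actual value is non-zero.
        mape = 100.0 / T * sum(abs(err[t] / y[t][i]) for t in range(T))
        # DA compares the sign of day-over-day changes in the same interval.
        hits = sum(
            1
            for t in range(1, T)
            if (y[t][i] - y[t - 1][i]) * (y_hat[t][i] - y_hat[t - 1][i]) > 0
        )
        da = 100.0 * hits / (T - 1)
        out.append({"MAE": mae, "RMSE": rmse, "MAPE": mape, "DA": da})
    return out


# Toy data: 3 days, 2 intervals per day (illustrative values only).
y = [[10.0, 20.0], [12.0, 18.0], [11.0, 22.0]]
y_hat = [[11.0, 20.0], [13.0, 19.0], [12.0, 18.0]]
metrics = interval_metrics(y, y_hat)
```

Reporting `metrics[i]` against the time of day is what exposes the diurnal error profile that a single pooled MAE would hide.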

3.2. Probabilistic and Interval Metrics

Intraday interval-level probabilistic scores include:

| Metric | Definition (per interval $i$ or margin $k$) |
|---|---|
| Coverage Probability | $\mathrm{Cov}_{i} = \frac{1}{T}\sum_{t=1}^{T} \mathbf{1}\{ y_{t,i} \in [L_{t,i}, U_{t,i}] \}$ |
| Interval (Winkler) Score | $WS_{t,i} = \begin{cases} W_{t,i}, & y_{t,i} \in [L_{t,i}, U_{t,i}] \\ W_{t,i} + \frac{2}{\alpha}(L_{t,i} - y_{t,i}), & y_{t,i} < L_{t,i} \\ W_{t,i} + \frac{2}{\alpha}(y_{t,i} - U_{t,i}), & y_{t,i} > U_{t,i} \end{cases}$, where $W_{t,i} = U_{t,i} - L_{t,i}$ is the interval width and $\alpha$ the nominal miscoverage level |
| CRPS (for probabilistic forecasts) | $\mathrm{CRPS}_{t,i} = \frac{1}{M}\sum_{m=1}^{M} \lvert \tilde x_{m,t,i} - x_{t,i} \rvert - \frac{1}{2M^{2}}\sum_{m,n=1}^{M} \lvert \tilde x_{m,t,i} - \tilde x_{n,t,i} \rvert$ |
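The three scores above translate almost directly into code. The following is a minimal sketch for a single interval, assuming the Winkler width term is the upper bound minus the lower bound and that the CRPS is estimated from an ensemble of $M$ members; the function names are illustrative.

```python
def coverage(ys, lowers, uppers):
    """Share of observations that fall inside their prediction interval."""
    inside = sum(1 for y, lo, up in zip(ys, lowers, uppers) if lo <= y <= up)
    return inside / len(ys)


def winkler_score(y, lower, upper, alpha):
    """Interval score: width plus a 2/alpha penalty for each unit of miss."""
    width = upper - lower
    if y < lower:
        return width + 2.0 / alpha * (lower - y)
    if y > upper:
        return width + 2.0 / alpha * (y - upper)
    return width


def crps_ensemble(members, obs):
    """Ensemble estimate of the CRPS for one day and one interval."""
    m = len(members)
    accuracy = sum(abs(x - obs) for x in members) / m
    spread = sum(abs(a - b) for a in members for b in members) / (2.0 * m * m)
    return accuracy - spread


# Illustrative values for one interval observed on three days.
cov = coverage([1.0, 5.0, 12.0], [0.0, 0.0, 0.0], [10.0, 10.0, 10.0])
ws_inside = winkler_score(5.0, 0.0, 10.0, alpha=0.1)
ws_above = winkler_score(12.0, 0.0, 10.0, alpha=0.1)
crps = crps_ensemble([1.0, 3.0], obs=2.0)
```

Averaging `winkler_score` and `crps_ensemble` over days for each interval gives the per-interval profiles; lower values are better for both, while coverage should sit close to the nominal level $1 - \alpha$.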

For functional or curve forecasts (e.g., VIX curves or cryptocurrency return functions), interval-level performance is assessed via empirical coverage per grid point, interval width, Gneiting–Raftery interval score, and optionally CRPS, all evaluated per time grid within the day (Shang et al., 2018, Shang et al., 2023, Jasiak et al., 26 May 2025).

3.3. Multivariate and Pathwise Scores

For multivariate or pathwise intraday intervals (e.g., 4×15-min vector in EPEX-ID3, 10×15-min VWAP in continuous intraday power markets), joint metrics include the energy score (ES) and variogram score (VS), both strictly proper for multivariate distributions (Cramer et al., 2022, Chen et al., 28 May 2025):

| Multivariate Metric | Key equation |
|---|---|
| Energy score | $ES_{t} = \frac{1}{N}\sum_{s=1}^{N} \lVert x_{t} - \hat{x}_{t,s} \rVert_{2} - \frac{1}{2N^{2}} \sum_{s,s'} \lVert \hat{x}_{t,s} - \hat{x}_{t,s'} \rVert_{2}$ |
| Variogram score | $VS_{t} = \sum_{i<j} w_{ij} \left( \lvert x_{t,i} - x_{t,j} \rvert^{\gamma} - \frac{1}{N}\sum_{s} \lvert \hat{x}_{t,s,i} - \hat{x}_{t,s,j} \rvert^{\gamma} \right)^{2}$, usually $\gamma = 0.5$ |
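Both joint scores can be computed from a set of sampled scenario paths. The sketch below follows the two equations above for a single day, with unit weights $w_{ij} = 1$ assumed when none are supplied; the function names and toy vectors are illustrative.

```python
import math


def energy_score(x, samples):
    """Energy score of one observed path x against sampled paths."""
    n = len(samples)

    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

    accuracy = sum(dist(x, s) for s in samples) / n
    spread = sum(dist(s, s2) for s in samples for s2 in samples) / (2.0 * n * n)
    return accuracy - spread


def variogram_score(x, samples, gamma=0.5, weights=None):
    """Variogram score of order gamma; weights default to 1 for every pair."""
    d, n = len(x), len(samples)
    total = 0.0
    for i in range(d):
        for j in range(i + 1, d):
            w = 1.0 if weights is None else weights[i][j]
            observed = abs(x[i] - x[j]) ** gamma
            expected = sum(abs(s[i] - s[j]) ** gamma for s in samples) / n
            total += w * (observed - expected) ** 2
    return total


# Toy two-interval path with two sampled scenarios (illustrative values).
x_obs = [0.0, 0.0]
scenarios = [[3.0, 4.0], [-3.0, -4.0]]
es = energy_score(x_obs, scenarios)
vs = variogram_score(x_obs, scenarios)
```

The energy score mainly rewards getting the level of the whole path right, while the variogram score reacts to errors in the differences between intervals, so the two are usually reported together.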

4. Empirical Findings: Error Patterns, Intraday Regularities, and Regime Shifts

4.1. Diurnal Error Structures

Interval-level evaluation routinely uncovers pronounced diurnal/seasonal error structures. In electricity price forecasting across NEM regions, MAE and RMSE peak during the evening ramp (16:00–20:30), sMAPE surges in midday negative-price regimes, and DA decays in periods of frequent trend changes (Gani et al., 1 Feb 2026). TAS shows the lowest errors, while SA and VIC show extreme spikes and the highest sMAPE, consistent with their renewable-penetration and volatility profiles.

Intraday volume forecasts display the classic U-shaped curve: open and close intervals exhibit sharp volume spikes, with minima in late morning and early afternoon (Graczyk et al., 2018, Krishnan et al., 2024). The convexity and relaxation exponents of this profile are regime- and period-dependent, subject to microstructure rule changes (e.g., SEC short-sale reforms triggering structural breaks in the post-2008 era).

4.2. Persistence and Memory

Interval-level autocorrelation diagnostics, as in 30-min cross-sectional regressions of stock returns, reveal strong return continuation at daily multiples: lagged coefficients at $k = 13, 26, \ldots$ (multiples of the 13 half-hour intervals in a trading day) are positive and statistically significant for up to 40 trading days (Heston et al., 2010). These effects persist even after conditioning on liquidity, order imbalance, and volatility proxies.
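A stripped-down version of this diagnostic can be sketched as follows: at each half-hour interval, regress the cross-section of current returns on the cross-section of returns $k$ intervals earlier, then average the slopes over time. This is a simplified sketch under stated assumptions, not the cited study's full procedure: it omits standard errors and control variables, and the function names and synthetic data are made up for illustration.

```python
def cross_sectional_slope(x, y):
    """OLS slope of y on x across stocks at one point in time."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def mean_lagged_coefficient(returns, k):
    """Average cross-sectional slope of r[t] on r[t - k].

    returns[t][s]: return of stock s in half-hour interval t, with
    intervals concatenated across trading days.
    """
    slopes = [
        cross_sectional_slope(returns[t - k], returns[t])
        for t in range(k, len(returns))
    ]
    return sum(slopes) / len(slopes)


# Synthetic panel: 4 stocks, 3 days of 13 half-hour intervals, built so
# that each stock repeats its own intraday pattern exactly every day.
panel = [
    [(s + 1) * (t % 13 + 1) * 0.001 for s in range(4)]
    for t in range(39)
]
coef_daily_lag = mean_lagged_coefficient(panel, k=13)
```

With real data the coefficient at $k = 13$ would be far smaller than in this deliberately periodic example; the point of the sketch is only the mechanics of lagging by whole trading days.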

4.3. Regime Change, Nonstationarity, and Change-Point Detection

Interval-level analysis is critical for detecting changes in volatility profiles, diurnal shape, or distributional properties. Formal functional-data-based tests for shape- and magnitude-breaks provide explicit, grid-consistent estimators for change-point location and size (Kokoszka et al., 2024). Empirical results in US equities and volatility indices demonstrate pronounced breaks in both diurnal shape and overall volatility, especially around market crises.
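To make the idea of locating a magnitude break concrete, the sketch below collapses each day's intraday curve to a single magnitude and applies a basic CUSUM statistic to the resulting daily series. This scalar CUSUM is a swapped-in simplification for illustration only; it is not the functional-data test of the cited work, and the function names and toy curves are assumptions.

```python
import math


def curve_magnitude(day_curve):
    """Euclidean norm of one day's interval values (a crude size measure)."""
    return math.sqrt(sum(v * v for v in day_curve))


def cusum_change_point(values):
    """Return (k, stat): k observations precede the estimated break.

    stat is the maximum absolute centred partial sum, scaled by sqrt(n).
    """
    n = len(values)
    mean = sum(values) / n
    best_k, best_stat = 0, 0.0
    partial = 0.0
    for k in range(1, n):
        partial += values[k - 1] - mean
        if abs(partial) > best_stat:
            best_k, best_stat = k, abs(partial)
    return best_k, best_stat / math.sqrt(n)


# Toy example: four calm days followed by four days with tripled volatility.
curves = [[1.0, 1.0]] * 4 + [[3.0, 3.0]] * 4
daily_size = [curve_magnitude(c) for c in curves]
break_day, stat = cusum_change_point(daily_size)
```

A shape break could be probed in the same spirit by first normalising each curve by its own magnitude and tracking a per-interval share instead of the overall size, although a formal test needs critical values that this sketch does not provide.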

4.4. Correlation Processes and Microstructure Interactions

Interval-level estimation of spot correlations between equities displays consistent upward-sloping diurnal patterns: lower correlation in the morning, rising toward the close. Robust nonparametric tests reject the null of time-homogeneous correlation across the majority of months sampled (Christensen et al., 2024). These findings manifest in minimum-variance hedging: time-varying, interval-specific hedge ratios yield significant risk reduction relative to daily-average constant-hedge strategies.
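The hedging implication can be illustrated with the textbook minimum-variance hedge ratio, $h_i = \mathrm{Cov}_i(r^{a}, r^{h}) / \mathrm{Var}_i(r^{h})$, estimated separately for each interval $i$. The sketch below uses simple sample moments across days rather than the spot-covariance estimators of the cited paper; the names and toy returns are assumptions.

```python
def hedge_ratios_by_interval(asset, hedge):
    """Minimum-variance hedge ratio per interval.

    asset[t][i], hedge[t][i]: returns of the hedged asset and of the
    hedging instrument on day t in interval i.
    """
    T, N = len(asset), len(asset[0])
    ratios = []
    for i in range(N):
        a = [asset[t][i] for t in range(T)]
        h = [hedge[t][i] for t in range(T)]
        mean_a, mean_h = sum(a) / T, sum(h) / T
        cov = sum((x - mean_a) * (z - mean_h) for x, z in zip(a, h))
        var = sum((z - mean_h) ** 2 for z in h)
        ratios.append(cov / var)
    return ratios


# Toy data: 3 days, 2 intervals, with co-movement that differs by interval.
asset_returns = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
hedge_returns = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0]]
h_by_interval = hedge_ratios_by_interval(asset_returns, hedge_returns)
```

A constant daily hedge would apply one ratio to every interval; comparing the variance of the hedged position under `h_by_interval` with that under a single pooled ratio is one way to quantify the benefit described above.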

In market microstructure, SVARs estimated per 15-min interval expose sharp diurnal and announcement-driven shifts in the mutual endogeneity of returns and order-flow imbalances (Takahashi, 9 Aug 2025). Price impact peaks at the open, drops at the close, and surges in the presence of macroeconomic announcements, accompanied by distinctive volatility and liquidity patterns.

5. Model Classes and Techniques for Interval-Level Evaluation

A spectrum of methodologies is implemented for interval-level evaluation in contemporary research:

All approaches emphasize interval-level backtesting, cross-validation, and reporting of relevant error or risk metrics per interval, rather than only in aggregate.

6. Implications for Forecasting, Risk Management, and Strategy

  • Forecasting and scheduling: Interval-level diagnostics guide model selection by revealing interval-specific strengths and weaknesses (e.g., LSTM versus transformer models in electricity price prediction, which trade short-term sensitivity against horizon robustness) (Gani et al., 1 Feb 2026).
  • Risk management: Margins are highly sensitive to interval choice: scaled high-frequency (5-min or 1-hr) return-based margins are systematically higher than daily-close-based ones, prompting redefinition of margining windows and supporting the practice of intraday margin calls (Cotter et al., 2011).
  • Optimal trading and liquidity provision: Accurate interval-level forecasts and error quantification are critical for VWAP-based execution, filling schedules, and anomaly detection in both equities and futures (e.g., early/late volume surges, interval-specific VWAP tracking error) (Lee et al., 2024, Krishnan et al., 2024).
  • Economic evaluation: Marginal statistical gains at the interval level may not translate to economic gains; for example, naive last-interval sell rules can capture much of feasible profit, while pathwise forecast sharpness may have more trading value when timing or selecting extremal intervals matters more than mean accuracy (Chen et al., 28 May 2025).

7. Practical Guidelines and Open Challenges

Several practical consequences and methodological recommendations emerge:

  • Model tuning and feature selection: Interval-specific model selection, calibration, and inclusion of exogenous signals (technical indicators, forecast errors, regime identifiers) are essential for balancing performance across intervals and adapting to structural/nonstationary changes (Krishnan et al., 2024, Gani et al., 1 Feb 2026).
  • Dynamic re-estimation and monitoring: Live forecasting systems should routinely re-estimate interval-level parameters or error profiles, re-calibrate band sharpness, and monitor for shape/magnitude breaks (rolling or binary-segmentation change-point tests) (Kokoszka et al., 2024).
  • Coverage and calibration: Use of conformal prediction and coverage-based metrics ensures empirically valid prediction intervals across intervals and can be tuned for desired operating characteristics (Kath et al., 2019).
  • Interpreting diurnal and regime effects: Forecasting error, variance, and structural patterning must correct for both intraday and longer-run nonstationarities, as well as sectoral or regime-dependent effects in cross-sectional evaluation (Graczyk et al., 2018, Christensen et al., 2024).
  • Communicating uncertainty and actionable outputs: Interval-level forecast outputs should include well-calibrated confidence intervals or risk scores per interval, enabling end-users to quantify time-of-day-dependent risk and adjust positions dynamically.
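The coverage and calibration guideline can be sketched with split conformal prediction applied separately to each interval: the half-width of the band for interval $i$ is an empirical quantile of absolute calibration residuals in that interval. Calibrating per interval, the symmetric band, and all names below are assumptions for this illustration rather than details from the cited work.

```python
import math


def conformal_halfwidths(cal_y, cal_pred, alpha):
    """Per-interval half-widths from absolute calibration residuals.

    cal_y, cal_pred: T calibration days, each a list of N interval values.
    Uses the finite-sample rank ceil((T + 1) * (1 - alpha)).
    """
    T, N = len(cal_y), len(cal_y[0])
    rank = math.ceil((T + 1) * (1 - alpha))
    if rank > T:
        raise ValueError("too few calibration days for this alpha")
    halfwidths = []
    for i in range(N):
        scores = sorted(abs(cal_y[t][i] - cal_pred[t][i]) for t in range(T))
        halfwidths.append(scores[rank - 1])
    return halfwidths


def predict_intervals(pred_day, halfwidths):
    """Symmetric band around each interval's point forecast."""
    return [(p - h, p + h) for p, h in zip(pred_day, halfwidths)]


# Toy calibration set: 7 days, 2 intervals; the second interval is noisier.
cal_pred = [[0.0, 0.0] for _ in range(7)]
cal_y = [[float(t + 1), 2.0 * (t + 1)] for t in range(7)]
q = conformal_halfwidths(cal_y, cal_pred, alpha=0.25)
bands = predict_intervals([100.0, 50.0], q)
```

Because each interval gets its own half-width, the band widens automatically at volatile times of day. Re-running the calibration on a rolling window is one simple way to combine this with the dynamic re-estimation guideline.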

Open challenges include integration of nonstationary and regime-switching phenomena in interval-level model architectures, scalable joint modeling of high-dimensional multivariate intervals, and robust evaluation strategies under microstructure noise, censoring, and irregular event timing.
