Predictive inference for time series: why is split conformal effective despite temporal dependence?
Published 2 Oct 2025 in stat.ML, cs.LG, math.ST, and stat.TH | (2510.02471v1)
Abstract: We consider the problem of uncertainty quantification for prediction in a time series: if we use past data to forecast the next time point, can we provide valid prediction intervals around our forecasts? To avoid placing distributional assumptions on the data, in recent years the conformal prediction method has been a popular approach for predictive inference, since it provides distribution-free coverage for any iid or exchangeable data distribution. However, in the time series setting, the strong empirical performance of conformal prediction methods is not well understood, since even short-range temporal dependence is a strong violation of the exchangeability assumption. Using predictors with "memory" -- i.e., predictors that utilize past observations, such as autoregressive models -- further exacerbates this problem. In this work, we examine the theoretical properties of split conformal prediction in the time series setting, including the case where predictors may have memory. Our results bound the loss of coverage of these methods in terms of a new "switch coefficient", measuring the extent to which temporal dependence within the time series creates violations of exchangeability. Our characterization of the coverage probability is sharp over the class of stationary, $\beta$-mixing processes. Along the way, we introduce tools that may prove useful in analyzing other predictive inference methods for dependent data.
The paper introduces the switch coefficient to quantify temporal dependence and explains its role in maintaining prediction interval coverage.
It establishes sharp lower and upper bounds on coverage for stationary β-mixing processes, with a coverage loss that decays at rate $1/n$ in the sample size.
The framework is applicable to predictors with memory and black-box ML models, offering practical insights for uncertainty quantification in time series forecasting.
Predictive Inference for Time Series: Split Conformal Prediction under Temporal Dependence
Introduction
This paper addresses the theoretical underpinnings of split conformal prediction for time series data, focusing on the challenge posed by temporal dependence. Conformal prediction is widely used for uncertainty quantification in predictive modeling due to its distribution-free coverage guarantees under exchangeability. However, time series data inherently violate exchangeability due to temporal dependencies, especially when predictors utilize historical observations ("memory"). Despite this, split conformal prediction often performs well empirically in time series contexts. The paper provides a rigorous explanation for this phenomenon, introducing the "switch coefficient" as a measure of deviation from exchangeability and establishing sharp coverage bounds for stationary, β-mixing processes.
Problem Formulation and Conformal Prediction in Time Series
The predictive inference problem is formalized for a time series $Z=(Z_1,\dots,Z_{n+1})$, where each $Z_i=(X_i,Y_i)$ consists of covariates and a response. The goal is to construct a prediction interval for $Y_{n+1}$ using a predictive model $f$ and the observed data $(X_i,Y_i)_{i=1}^n$. Split conformal prediction constructs prediction sets based on the quantiles of conformity scores, which are typically residuals $s(z)=|y-f(x)|$. The method is agnostic to the underlying predictive model and requires only exchangeability for its theoretical guarantees.
In practice, predictors often have memory, i.e., they depend on the $L$ previous observations. This complicates the analysis: the conformity scores are no longer independent of the calibration data, and the temporal dependence can induce strong violations of exchangeability.
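As a concrete illustration of the procedure described above, here is a minimal split conformal sketch on a simulated AR(1) series, with a least-squares AR(1) forecaster playing the role of a predictor with memory $L=1$. The simulated series, the forecaster, and all names below are a hypothetical example of ours, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) series Z_t = 0.7 * Z_{t-1} + noise (hypothetical example).
n = 500
z = np.zeros(n + 1)
for t in range(1, n + 1):
    z[t] = 0.7 * z[t - 1] + rng.normal()

# Fit an AR(1) forecaster on the first half by least squares on lagged values.
split = n // 2
X_train = z[:split].reshape(-1, 1)   # Z_{t-1}
y_train = z[1:split + 1]             # Z_t
phi = np.linalg.lstsq(X_train, y_train, rcond=None)[0]

# Conformity scores on the calibration half: s(z) = |y - f(x)|.
X_cal = z[split:n].reshape(-1, 1)
y_cal = z[split + 1:n + 1]
scores = np.abs(y_cal - (X_cal @ phi))

# Split conformal quantile at miscoverage level alpha.
alpha = 0.1
m = len(scores)
q = np.quantile(scores, np.ceil((1 - alpha) * (m + 1)) / m, method="higher")

# Prediction interval for the next point Z_{n+1}.
forecast = phi[0] * z[n]
interval = (forecast - q, forecast + q)
```

The paper's point is that, even though these calibration scores are temporally dependent, intervals built this way retain near-nominal coverage when the dependence (as measured by the switch coefficient) is mild.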
The Switch Coefficient: Quantifying Temporal Dependence
The paper introduces the switch coefficient $\Psi_{k,\tau}(Z)$, defined as the total variation distance between two specific subvectors of the time series obtained by deleting blocks of entries. The averaged switch coefficient $\bar{\Psi}_\tau(Z)$ quantifies the overall deviation from exchangeability at a given lag $\tau$. For stationary β-mixing processes, the switch coefficient is bounded by the mixing coefficient: $\Psi_{k,\tau}(Z) \le 2\beta(\tau)$ for $k \le n - \tau$.
This framework allows the authors to relate the coverage properties of conformal prediction directly to the temporal dependence structure of the data, rather than relying on exchangeability or independence.
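To make the bound $\Psi_{k,\tau}(Z) \le 2\beta(\tau)$ concrete, the sketch below computes the β-mixing coefficient of a simple stationary two-state Markov chain, where $\beta(\tau)$ is the average total-variation distance between the $\tau$-step conditional law and the stationary law. The chain, the flip probability `p`, and the function name `beta_mixing` are illustrative choices of ours, not from the paper.

```python
import numpy as np

# Symmetric two-state Markov chain: flip state with probability p (hypothetical).
p = 0.3
P = np.array([[1 - p, p], [p, 1 - p]])
pi = np.array([0.5, 0.5])  # stationary distribution

def beta_mixing(tau):
    """beta(tau) for a stationary Markov chain: the pi-average total-variation
    distance between the tau-step conditional law P^tau(z, .) and pi."""
    P_tau = np.linalg.matrix_power(P, tau)
    tv = 0.5 * np.abs(P_tau - pi).sum(axis=1)  # TV distance per start state
    return float(pi @ tv)

# For this chain beta(tau) = (1 - 2p)^tau / 2 in closed form; it decays
# geometrically, so the bound Psi_{k,tau} <= 2 * beta(tau) shrinks quickly in tau.
for tau in [1, 2, 5, 10]:
    assert np.isclose(beta_mixing(tau), 0.5 * (1 - 2 * p) ** tau)
```

Any process whose mixing coefficients decay this fast has a small averaged switch coefficient at moderate lags, which is exactly the regime where the paper's coverage bounds are strong.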
Main Theoretical Results
Coverage Guarantees
The central result is a lower bound on the coverage probability of split conformal prediction intervals in terms of the switch coefficient: the coverage deficit relative to the nominal level $1-\alpha$ is controlled by the averaged switch coefficient $\bar{\Psi}_\tau(Z)$, together with a term of order $\tau/n$.
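Schematically, the guarantee takes the following shape (our paraphrase based on the linear $\tau/n$ rate described below, not the paper's exact statement):

```latex
\mathbb{P}\bigl(Y_{n+1} \in \widehat{C}_n\bigr)
  \;\ge\; 1 - \alpha \;-\; \bar{\Psi}_\tau(Z) \;-\; c\,\frac{\tau}{n},
```

for a universal constant $c$, where $\widehat{C}_n$ is the split conformal prediction set built from $n$ calibration points.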
This result is sharp: a matching construction shows that the bound cannot be improved beyond a universal constant. The coverage loss scales as $\tau/n$, i.e., linearly in $1/n$ and in the mixing time $\tau$, a more precise characterization than previous results, which established only sublinear rates.
Split Conformal with Data-Dependent Scores
For split conformal prediction where the score function is trained on a subset of the data, the coverage guarantee is extended to account for the dependence between the score function and the calibration data. The bound involves deleting initial calibration scores to mitigate this dependence, and the coverage loss is again controlled by the switch coefficient and the β-mixing coefficients.
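A minimal sketch of this burn-in idea follows: the usual split conformal recipe, except that the first `burn_in` calibration scores adjacent to the training segment are discarded. The function `split_conformal_interval`, the AR(1) simulation, and the burn-in length are hypothetical choices of ours, not the paper's construction.

```python
import numpy as np

def split_conformal_interval(z, fit, predict, alpha=0.1, burn_in=0):
    """Split conformal for a one-step-ahead time series forecast (sketch).
    Dropping the first `burn_in` calibration scores weakens the dependence
    between the fitted score function and the calibration data."""
    n = len(z) - 1
    split = n // 2
    model = fit(z[:split + 1])
    # Residual scores on the calibration segment, skipping the burn-in block.
    scores = np.abs(z[split + 1 + burn_in:n + 1]
                    - predict(model, z[split + burn_in:n]))
    m = len(scores)
    level = min(np.ceil((1 - alpha) * (m + 1)) / m, 1.0)
    q = np.quantile(scores, level, method="higher")
    f = predict(model, z[n:n + 1])[0]
    return f - q, f + q

# AR(1) example with a least-squares AR(1) forecaster (hypothetical).
rng = np.random.default_rng(1)
z = np.zeros(601)
for t in range(1, 601):
    z[t] = 0.7 * z[t - 1] + rng.normal()

fit = lambda zs: np.linalg.lstsq(zs[:-1].reshape(-1, 1), zs[1:], rcond=None)[0]
predict = lambda phi, x: x * phi[0]
lo, hi = split_conformal_interval(z, fit, predict, alpha=0.1, burn_in=20)
```

The burn-in length would, in the paper's analysis, be tied to the lag $\tau$ at which the mixing coefficients become small.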
Overcoverage Analysis
The paper also establishes an upper bound on the coverage probability, showing that the conformal prediction set is not overly conservative. The switch coefficient provides both lower and upper bounds, ensuring that the prediction intervals are neither too narrow nor too wide.
Implications and Comparison to Prior Work
The results explain the empirical effectiveness of split conformal prediction in time series settings, even when temporal dependence is present. The switch coefficient provides a unified framework for quantifying deviations from exchangeability and can be used to analyze other predictive inference methods for dependent data. The bounds are tighter than those obtained via blocking and empirical process techniques, which typically yield sublinear rates in n.
The analysis accommodates predictors with arbitrary memory and does not require consistency of the predictive model, making it applicable to black-box ML models commonly used in practice. The theoretical guarantees are robust to strong short-range dependence, provided the mixing coefficients decay sufficiently with lag.
Future Directions
The switch coefficient is a natural object for studying stochastic processes and may have applications beyond conformal prediction, such as in online learning and statistical inference for dependent data. The proof techniques, which exploit the stability of quantile functions under addition and deletion of scores, could lead to sharper analyses of other uncertainty quantification methods in dynamic settings.
Conclusion
This work provides a rigorous theoretical foundation for the use of split conformal prediction in time series analysis, demonstrating that coverage guarantees can be maintained under temporal dependence by quantifying deviations from exchangeability via the switch coefficient. The results are sharp and broadly applicable, offering practical guidance for uncertainty quantification in time series forecasting with black-box models. The framework and techniques introduced have potential for further development in the analysis of predictive inference under dependence.