Bellman Conformal Inference (BCI)
- Bellman Conformal Inference (BCI) is a framework that generates calibrated predictive intervals for univariate time series by leveraging dynamic programming to balance interval length and long-term coverage.
- It formulates a one-dimensional stochastic control problem to optimally select interval parameters, acting as a robust wrapper around black-box forecasting models.
- BCI comes with non-asymptotic long-run coverage guarantees, and empirical evaluations show intervals up to roughly 20% shorter than those from Adaptive Conformal Inference (ACI).
Bellman Conformal Inference (BCI) is a framework for producing calibrated predictive intervals for univariate time series by leveraging dynamic programming to minimize average interval length while maintaining rigorous long-term coverage guarantees. BCI operates as a wrapper around arbitrary black-box multi-step forecasting models, directly addressing the potential miscalibration of nominal prediction intervals provided by such models. At each step, BCI formulates and solves a tractable one-dimensional stochastic control problem to select interval parameters, delivering approximately calibrated intervals under arbitrary distribution shifts and temporal dependencies and yielding tighter prediction intervals compared to previous methods such as Adaptive Conformal Inference (ACI) (Yang et al., 2024).
1. Problem Formulation and Calibration Objective
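Consider a univariate time series $\{Y_t\}_{t \ge 1}$, where $Y_t$ is revealed only at time $t$, and let $\mathcal{F}_t$ represent all observable information up to time $t$. At each time $t$, a black-box forecaster provides, for each horizon $s = 1, \dots, T$ and each $\beta \in [0, 1]$, a nominal $(1-\beta)$-level prediction interval
$$C_{t+s \mid t}(1-\beta),$$
with interval length $L_{t+s \mid t}(\beta) = |C_{t+s \mid t}(1-\beta)|$. While ideally $\mathbb{P}\bigl(Y_{t+s} \in C_{t+s \mid t}(1-\beta) \mid \mathcal{F}_t\bigr) = 1-\beta$, in practice these prediction intervals may be poorly calibrated.
BCI leverages a data-dependent miscoverage index $\alpha_t$, adapted to $\mathcal{F}_{t-1}$, to determine the prediction interval $C_{t \mid t-1}(1-\alpha_t)$. Defining the indicator $\mathrm{err}_t = \mathbb{1}\{Y_t \notin C_{t \mid t-1}(1-\alpha_t)\}$, the calibration objective is strict long-run validity for a pre-specified target $\bar\alpha$:
$$\lim_{K \to \infty} \frac{1}{K} \sum_{t=1}^{K} \mathrm{err}_t = \bar\alpha,$$
uniformly for any data-generating process, including adversarial or deterministic sequences. The only assumptions are that $C_{t \mid t-1}(1-\beta)$ is monotonic in $\beta$ (set inclusion) and $C_{t \mid t-1}(1)$ is the trivial full-space interval.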
2. Stochastic Control Problem and Dynamic Programming
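BCI addresses interval selection as a finite-horizon one-dimensional stochastic control problem (SCP) at each time $t$. The objective is to choose $\alpha_t, \dots, \alpha_{t+T-1}$ to minimize expected total interval length, plus a penalty on excess miscoverage:
$$\min_{\alpha_t, \dots, \alpha_{t+T-1}} \mathbb{E}\left[\sum_{s=t}^{t+T-1} L_{s \mid t-1}(\alpha_s) + \lambda_t \max\left(\frac{1}{T}\sum_{s=t}^{t+T-1} \mathrm{err}_s - \bar\alpha,\ 0\right)\right].$$
Here, for each $s$, $\mathrm{err}_s = \mathbb{1}\{\alpha_s > \beta_s\}$, with $\beta_s$ drawn from $F_{s \mid t-1}$, the analyst's empirical estimate of the distribution of the future probability integral transform (PIT). The scalar weight $\lambda_t$ controls the tradeoff between short intervals and coverage. No constraints are required beyond ensuring $\alpha_s \in [0, 1]$; the safeguard ensures that if $\lambda_t > \lambda_{\max}$, the trivial solution $\alpha_t = 0$ (maximal interval) is always achievable.
The SCP is solved via dynamic programming on the state $\rho_s$ (number of miscoverages up to $s$), with terminal cost at $s = t+T$:
$$J_{t+T}(\rho) = \lambda_t \max\left(\frac{\rho}{T} - \bar\alpha,\ 0\right).$$
The Bellman update admits explicit computation:
$$J_s(\rho) = \min_{\alpha \in [0,1]} \Bigl\{ L_{s \mid t-1}(\alpha) + F_{s \mid t-1}(\alpha)\, J_{s+1}(\rho+1) + \bigl(1 - F_{s \mid t-1}(\alpha)\bigr)\, J_{s+1}(\rho) \Bigr\}.$$
With $\Delta_s(\rho) = J_{s+1}(\rho+1) - J_{s+1}(\rho)$, the optimal action at $s$ is
$$\alpha_s^*(\rho) = \arg\min_{\alpha \in [0,1]} \bigl\{ L_{s \mid t-1}(\alpha) + \Delta_s(\rho)\, F_{s \mid t-1}(\alpha) \bigr\}.$$
The actual action at time $t$ is $\alpha_t^*(0)$, the optimal action at the initial state $\rho_t = 0$.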
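The backward recursion can be illustrated with a minimal sketch on a discrete grid of candidate miscoverage levels. Everything named here is an assumption made for illustration: the function `solve_scp`, the grid `alpha_grid`, the Gaussian nominal lengths, and the choice of a perfectly calibrated PIT (so the probability of miscoverage at level $\alpha$ is $\alpha$ itself). It is not the authors' implementation.

```python
import numpy as np
from statistics import NormalDist


def solve_scp(lengths, miss_prob, lam, alpha_bar):
    """Backward induction for a T-step SCP on a grid of candidate alphas.

    lengths[s, j]   : nominal interval length at horizon step s, grid point j
    miss_prob[s, j] : estimated miscoverage probability at step s, grid point j
    Returns (policy, value), where policy[s, r] is the grid index of the best
    action at step s after r miscoverages, and value is the optimal cost
    starting from zero miscoverages.
    """
    T = lengths.shape[0]
    # Terminal cost: penalty on the excess miscoverage rate over the horizon.
    J_next = lam * np.maximum(np.arange(T + 1) / T - alpha_bar, 0.0)
    policy = np.zeros((T, T), dtype=int)
    for s in range(T - 1, -1, -1):
        J = np.empty(s + 1)
        for r in range(s + 1):  # at step s at most s miscoverages have occurred
            delta = J_next[r + 1] - J_next[r]  # marginal cost of one more miss
            cost = lengths[s] + delta * miss_prob[s]
            j = int(np.argmin(cost))
            policy[s, r] = j
            J[r] = J_next[r] + cost[j]
        J_next = J
    return policy, float(J_next[0])


# Hypothetical inputs: standard Gaussian nominal intervals, calibrated PIT.
alpha_grid = np.linspace(0.01, 1.0, 100)
T = 5
z = np.array([NormalDist().inv_cdf(1.0 - float(a) / 2.0) for a in alpha_grid])
lengths = np.tile(2.0 * z, (T, 1))       # length of mu +/- z * sigma, sigma = 1
miss_prob = np.tile(alpha_grid, (T, 1))  # P(miss at level alpha) = alpha

policy_hi, value_hi = solve_scp(lengths, miss_prob, lam=1000.0, alpha_bar=0.1)
policy_lo, value_lo = solve_scp(lengths, miss_prob, lam=0.0, alpha_bar=0.1)
first_alpha_hi = alpha_grid[policy_hi[0, 0]]
first_alpha_lo = alpha_grid[policy_lo[0, 0]]
```

With no penalty (`lam=0.0`) the sketch picks the shortest interval at every step, while a large penalty pushes the first action toward small miscoverage levels, which is the length-versus-coverage tradeoff the weight is meant to control.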
3. Interval Construction and Online Updates
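Once $\alpha_t$ is determined, the prediction interval for $Y_t$ is $C_{t \mid t-1}(1-\alpha_t)$. The "uncalibrated PIT" at time $t$ is
$$\beta_t = \sup\{\beta \in [0, 1] : Y_t \in C_{t \mid t-1}(1-\beta)\}.$$
Then miscoverage is encoded as $\mathrm{err}_t = \mathbb{1}\{\alpha_t > \beta_t\}$. The update for $\lambda_t$ is performed via an online-gradient step,
$$\lambda_{t+1} = \lambda_t - \gamma(\bar\alpha - \mathrm{err}_t),$$
and whenever $\lambda_t > \lambda_{\max}$, BCI defaults to the full-space interval by truncating $\alpha_t$ to zero. This update ensures that
$$-\gamma\bar\alpha \le \lambda_t \le \lambda_{\max} + \gamma(1-\bar\alpha) \quad \text{for all } t$$
via induction, so that long-run miscoverage is controlled.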
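For a forecaster whose nominal intervals come from Gaussian quantiles, the uncalibrated PIT has a closed form, which the following sketch uses. It assumes a symmetric interval $\mu \pm \sigma z_{1-\beta/2}$, a miscoverage indicator that fires when the chosen level exceeds the PIT, and an online-gradient update with step size `gamma` toward the target `alpha_bar`. The function names are hypothetical.

```python
import math


def gaussian_pit(y, mu, sigma):
    """Largest beta such that y lies in mu +/- sigma * z_{1 - beta/2}."""
    z = abs(y - mu) / sigma
    phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF
    return 2.0 * (1.0 - phi)


def miscoverage(alpha, beta):
    """1 if the interval at level 1 - alpha misses the observation, else 0."""
    return 1 if alpha > beta else 0


def update_lambda(lam, err, alpha_bar, gamma):
    """Online-gradient step: lam rises after a miss and falls after a cover."""
    return lam - gamma * (alpha_bar - err)


beta_center = gaussian_pit(0.0, 0.0, 1.0)    # observation at the forecast mean
beta_tail = gaussian_pit(1.959964, 0.0, 1.0)  # observation at the 97.5% quantile
lam_after_miss = update_lambda(0.5, 1, alpha_bar=0.1, gamma=0.05)
lam_after_cover = update_lambda(0.5, 0, alpha_bar=0.1, gamma=0.05)
```

An observation at the forecast mean is covered by every nominal interval (PIT of 1), whereas one far in the tail has a PIT near 0 and is missed by almost any informative interval.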
4. Algorithmic Workflow
BCI can be summarized in a stepwise form as follows:
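- Input: Previous $\lambda_{t-1}$, $\mathrm{err}_{t-1}$; multi-step forecasts $\{L_{s \mid t-1}, F_{s \mid t-1}\}_{s=t}^{t+T-1}$;
- Update the security parameter: $\lambda_t = \lambda_{t-1} - \gamma(\bar\alpha - \mathrm{err}_{t-1})$;
- Solve the stochastic control problem via dynamic programming to obtain $\alpha_s^*(\rho)$ for all $s = t, \dots, t+T-1$, $\rho = 0, \dots, s-t$;
- Set $\alpha_t = \alpha_t^*(0)\, \mathbb{1}\{\lambda_t \le \lambda_{\max}\}$ and output $C_{t \mid t-1}(1-\alpha_t)$;
- Observe $Y_t$, record $\mathrm{err}_t = \mathbb{1}\{Y_t \notin C_{t \mid t-1}(1-\alpha_t)\}$, and repeat.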
This workflow requires only future-looking forecasts (empirical PITs and nominal interval lengths), compatible with any off-the-shelf forecasting mechanism.
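The online loop can be sketched as follows, working directly on a sequence of realized PIT values. The callback `policy` is a hypothetical stand-in for the dynamic-programming solve (here simply a constant level), and the parameter names `gamma`, `lam_max`, and `lam0` are assumptions. The point of the sketch is the interplay between the security parameter and the safeguard, not the optimization itself.

```python
def run_bci_loop(betas, policy, alpha_bar=0.1, gamma=0.05, lam_max=1.0, lam0=0.0):
    """Run the online loop on realized PIT values `betas`.

    policy(lam) is a placeholder for the SCP solution at the current
    security parameter; it returns a proposed miscoverage level in [0, 1].
    """
    lam = lam0
    alphas, errs, lams = [], [], [lam0]
    for beta in betas:
        alpha_star = policy(lam)
        # Safeguard: fall back to the full-space interval when lam is too large.
        alpha = alpha_star if lam <= lam_max else 0.0
        err = 1 if alpha > beta else 0
        # Security-parameter update, applied once the outcome is observed.
        lam = lam - gamma * (alpha_bar - err)
        alphas.append(alpha)
        errs.append(err)
        lams.append(lam)
    return alphas, errs, lams


# Adversarial input: the observation always falls outside every informative
# interval (PIT of 0), so only the full-space interval can cover it.
n = 5000
alphas, errs, lams = run_bci_loop([0.0] * n, policy=lambda lam: 0.1)
avg_err = sum(errs) / n
```

Even on this adversarial sequence the average miscoverage settles near the target: each miss raises the security parameter until the safeguard forces a full-space interval, and each cover lowers it again.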
5. Coverage Properties and Theoretical Guarantees
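BCI establishes a non-asymptotic bound for average miscoverage. For any starting index $T_0$ and batch of $K$ rounds,
$$\left| \frac{1}{K} \sum_{t=T_0+1}^{T_0+K} \mathrm{err}_t - \bar\alpha \right| \le \frac{\lambda_{\max} + \gamma}{\gamma K}.$$
This bound guarantees, by sending $K \to \infty$, that
$$\lim_{K \to \infty} \frac{1}{K} \sum_{t=T_0+1}^{T_0+K} \mathrm{err}_t = \bar\alpha$$
almost surely, for any data sequence, regardless of stochasticity or stationarity. The approach does not impose assumptions on the forecaster or underlying process.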
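The mechanism behind a bound of this kind can be sketched by telescoping the security-parameter update. The derivation assumes the online-gradient form $\lambda_{t+1} = \lambda_t - \gamma(\bar\alpha - \mathrm{err}_t)$ with step size $\gamma > 0$, and that $\lambda_t$ remains in some bounded interval $[a, b]$ for all $t$; the endpoints $a$ and $b$ are symbols introduced here for illustration only.

$$
\begin{aligned}
\lambda_{T_0+K+1} - \lambda_{T_0+1}
  &= \sum_{t=T_0+1}^{T_0+K} (\lambda_{t+1} - \lambda_t)
   = \gamma \sum_{t=T_0+1}^{T_0+K} (\mathrm{err}_t - \bar\alpha), \\
\frac{1}{K} \sum_{t=T_0+1}^{T_0+K} \mathrm{err}_t - \bar\alpha
  &= \frac{\lambda_{T_0+K+1} - \lambda_{T_0+1}}{\gamma K}, \\
\left| \frac{1}{K} \sum_{t=T_0+1}^{T_0+K} \mathrm{err}_t - \bar\alpha \right|
  &\le \frac{b - a}{\gamma K}.
\end{aligned}
$$

The argument uses no property of the data: it holds path by path, which is why the guarantee covers deterministic and adversarial sequences.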
6. Empirical Evaluation and Comparisons
Empirical assessments utilize datasets including daily logarithmic returns for stocks (e.g., AMD, Amazon, Nvidia), squared return volatility, and Google Trends queries (e.g., "deep learning"). Forecasters include a small transformer for returns, GARCH(1,1) for volatility, and a 5-layer LSTM for Google Trends, each producing nominal intervals by Gaussian quantiles.
The benchmark is Adaptive Conformal Inference (ACI), which recursively updates $\alpha_t$ using
$$\alpha_{t+1} = \alpha_t + \gamma(\bar\alpha - \mathrm{err}_t)$$
for set step-sizes $\gamma$. Metrics comprise local 500-point moving averages of miscoverage and interval length, and the proportion of intervals of infinite length (signaling uninformative coverage).
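A minimal sketch of the ACI baseline and the evaluation metrics is given below, operating on realized PIT values. It assumes the standard ACI recursion with a fixed step size `gamma`, treats any level at or below zero as an infinite-length interval, and uses a trailing-window moving average whose alignment is a choice made here rather than taken from the source. The synthetic PIT sequence is illustrative only.

```python
import numpy as np


def run_aci(betas, alpha_bar=0.1, gamma=0.05, alpha0=None):
    """ACI recursion: raise alpha after a cover, lower it after a miss."""
    alpha = alpha_bar if alpha0 is None else alpha0
    alphas, errs = [], []
    for beta in betas:
        err = 1 if alpha > beta else 0
        alphas.append(alpha)
        errs.append(err)
        alpha = alpha + gamma * (alpha_bar - err)
    return np.array(alphas), np.array(errs)


def moving_average(x, window=500):
    """Local average over every full window of `window` consecutive points."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")


def infinite_fraction(alphas):
    """Share of steps with alpha <= 0, i.e. a full-space interval."""
    return float(np.mean(np.asarray(alphas) <= 0.0))


# Synthetic, deliberately miscalibrated PITs (skewed toward small values).
rng = np.random.default_rng(0)
betas = rng.uniform(size=3000) ** 3
alphas, errs = run_aci(betas)
local_miscoverage = moving_average(errs)
frac_infinite = infinite_fraction(alphas)
```

Because the ACI level is unconstrained, it can drift to zero or below when the forecaster is overconfident, which is what the infinite-length proportion is designed to expose.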
Findings from the data include:
- BCI and ACI both achieve near-target 10% miscoverage.
- BCI yields consistently shorter average intervals (e.g., return series: 0.08 vs. 0.09) and avoids uninformative, infinite-length intervals observed with ACI under heavy distribution shifts or loose control.
- Even when forecaster intervals are well-calibrated (e.g., GARCH on volatility), BCI matches ACI in coverage and interval length while robustly avoiding infinite intervals.
- Largest benefits occur when nominal forecaster intervals are poorly calibrated (e.g., LSTM on Google Trends), with BCI reducing average interval widths by approximately 20% at the same coverage level (Yang et al., 2024).
7. Interpretation, Scope, and Relation to Existing Methods
BCI generalizes conformal inference for time series by incorporating dynamic programming and explicit multi-step prediction. Unlike ACI, which updates coverage controllers myopically, BCI reasons over a finite prediction horizon via stochastic control, optimizing the length-vs-coverage tradeoff. The methodology guarantees long-run frequentist coverage under arbitrary nonstationarity, adversarial distribution shifts, and even in the face of poor model calibration, while producing substantially tighter and more informative intervals. A plausible implication is that BCI represents a robust wrapper for any black-box forecasting pipeline, offering rigorous guarantees without assumptions on model correctness or data structure (Yang et al., 2024).