Autoregressive Multiplier Online Bootstrap
- The paper introduces an autoregressive online bootstrap technique that generates dependent weights to mimic intrinsic time series correlations.
- It employs an AR(1) weight sequence with normalization, enabling constant time and memory updates for real-time inference on streaming data.
- Empirical and theoretical analyses confirm robust uncertainty quantification and valid variance estimation under weak dependence.
The Autoregressive Multiplier Online Bootstrap is an online resampling technique for dependent data streams, particularly time series. It generates bootstrap replicates by associating each observation with algorithmically dependent random weights, derived from an autoregressive process with coefficients that approach unity. This ensures increasing local correlation among weights, enabling the method to respect and mimic intrinsic dependencies in the observed data. Designed for streaming contexts, the procedure delivers online updates in constant time and memory per step. Theoretical guarantees hold for a broad class of stationary, weakly dependent processes, with empirical investigations confirming robust uncertainty quantification and coverage properties even in settings with complex dependence structures (Palm et al., 2023).
1. Definition of the Autoregressive Weight Sequence
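The core of the method is a time-inhomogeneous AR(1) process that generates the sequence of bootstrap weights: each weight is obtained from the previous one through a time-varying autoregressive coefficient plus a scaled random innovation, where
- the autoregressive coefficients approach one as the time index grows,
- the innovations are i.i.d. random variables with mean zero and unit variance, typically standard normal.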
The process is "nearly non-stationary" because the autoregressive coefficient tends to one, inducing strong and growing dependence between successive weights. The weights are normalized at each time step by a factor that asymptotically converges to unity, making normalization negligible in large samples. This autoregressive construction allows for locally dependent bootstrap weights, preserving key features of the time series' dependence structure.
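As a rough illustration of the construction described in this section, the following Python sketch generates a single chain of weights. Only the AR(1) structure, the coefficients approaching one, and the standard normal innovations come from the text. The centering of the weights at one, the coefficient schedule `rho = 1 - t**(-beta)`, the default `beta=0.3`, and the name `ar_weights` are assumptions made for this example.

```python
import math
import random


def ar_weights(n, beta=0.3, seed=0):
    """Generate n multiplier weights from a time-inhomogeneous AR(1) recursion.

    Assumptions of this sketch (not taken from the source text):
    weights are centered at one and rho_t = 1 - t**(-beta).
    """
    rng = random.Random(seed)
    v = 0.0  # weight chain starts at zero; rho_1 = 0 makes this value irrelevant
    weights = []
    for t in range(1, n + 1):
        rho = 1.0 - t ** (-beta)       # assumed schedule, rho_t -> 1 as t grows
        zeta = rng.gauss(0.0, 1.0)     # i.i.d. standard normal innovation
        # AR(1) step around the assumed mean of one; the innovation scale
        # sqrt(1 - rho^2) keeps the weight variance at one.
        v = 1.0 + rho * (v - 1.0) + math.sqrt(1.0 - rho * rho) * zeta
        weights.append(v)
    return weights
```

Under these assumptions the weights have mean one and unit variance at every step, so a running average of the weights is a natural normalizing factor that drifts toward one.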
2. Online Algorithm and Computational Properties
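The method is inherently online: it maintains a fixed number of independent bootstrap replicates (chains) of the sample mean for an incoming time series. For each chain:
- Initialization: set the weight, the running mean, and the bootstrap mean to their starting values (the weight starts at zero).
- At each time step:
  - Observe the new data point.
  - For each chain:
    - Draw a standard normal innovation.
    - Compute the autoregressive coefficient for the current time step.
    - Update the weight via the AR(1) recursion.
    - Update the running mean.
    - Update the bootstrap mean.
  - After updating all chains, summarize the empirical distribution of the bootstrap means across chains for inference.
Computationally, each new observation costs constant time and constant memory per chain. Past data need not be stored and no block recomputation is required.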
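The update loop of this section might be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' reference implementation: the class name `OnlineARBootstrap`, the schedule `rho = 1 - t**(-beta)`, the centering of weights at one, and the choice to store a weighted sum and a weight sum per chain (rather than a recursively updated mean) are all hypothetical.

```python
import math
import random


class OnlineARBootstrap:
    """Streaming bootstrap of the sample mean with AR(1) multiplier weights.

    Sketch only: the recursion and coefficient schedule are assumed forms.
    """

    def __init__(self, n_chains=100, beta=0.3, seed=0):
        self.rng = random.Random(seed)
        self.beta = beta
        self.t = 0                       # number of observations seen so far
        self.mean = 0.0                  # plain running mean of the data
        self.v = [0.0] * n_chains        # current weight of each chain
        self.w_sum = [0.0] * n_chains    # running sum of weights per chain
        self.s_sum = [0.0] * n_chains    # running sum of weight * x per chain

    def update(self, x):
        """Process one observation; cost does not depend on self.t."""
        self.t += 1
        rho = 1.0 - self.t ** (-self.beta)      # assumed schedule
        scale = math.sqrt(1.0 - rho * rho)
        self.mean += (x - self.mean) / self.t
        for b in range(len(self.v)):
            zeta = self.rng.gauss(0.0, 1.0)     # one normal draw per chain
            v = 1.0 + rho * (self.v[b] - 1.0) + scale * zeta
            self.v[b] = v
            self.w_sum[b] += v
            self.s_sum[b] += v * x

    def replicates(self):
        """Normalized bootstrap means, one per chain."""
        return [s / w for s, w in zip(self.s_sum, self.w_sum)]
```

A caller would feed observations one at a time with `update(x)` and read `replicates()` whenever an interval is needed. The state consists of three numbers per chain plus two scalars, so memory does not grow with the length of the stream.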
3. Theoretical Guarantees and Statistical Properties
Let 1 denote the strong-mixing coefficients and 2 for the stationary sequence 3. Under the following conditions:
- (A1): 4,
- (A2): 5 for some 6,
- (A3): 7,
the following results hold for all 8:
- Consistency: The empirical bootstrap law
9
yields valid asymptotic confidence intervals via bootstrap quantiles.
- Variance Estimation: For 0 and the AR-bootstrap variance estimator
1
it holds that: - The bias decreases as 2. - The variance is 3. - The mean squared error is optimized for 4. - 5 in probability.
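To make the two uses of the bootstrap replicates concrete, the sketch below turns a list of replicate means into a quantile-based confidence interval and a variance estimate. The helper names are hypothetical, and the specific forms are assumptions of this example rather than the formulas of the source: the interval uses quantiles of the centered replicates, and the variance estimate is the sample size times the mean squared deviation of the replicates from the point estimate.

```python
import math


def quantile(sorted_vals, p):
    """Linear-interpolation quantile of an already sorted list, 0 <= p <= 1."""
    pos = p * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def bootstrap_interval(replicates, estimate, alpha=0.1):
    """Confidence interval from quantiles of (replicate - estimate)."""
    deviations = sorted(r - estimate for r in replicates)
    lower = estimate - quantile(deviations, 1.0 - alpha / 2.0)
    upper = estimate - quantile(deviations, alpha / 2.0)
    return lower, upper


def bootstrap_variance(replicates, estimate, n):
    """Assumed estimator: n times the mean squared deviation of replicates."""
    spread = sum((r - estimate) ** 2 for r in replicates) / len(replicates)
    return n * spread
```

For example, `bootstrap_interval([8.0, 9.0, 10.0, 11.0, 12.0], 10.0, alpha=0.5)` returns `(9.0, 11.0)`.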
4. Comparison with Classical Bootstrap Methods
The primary alternatives are block bootstrap and iid-multiplier bootstrap:
- Block Bootstrap: Builds overlapping blocks whose length grows with the sample size, with block resampling or block-level dependent multipliers. The cost per update grows with the sample size because all blocks must be re-formed as the sample grows, making it noncompetitive for streaming data.
- IID Multiplier Bootstrap: Assigns independent weights to each data point, achieving O(1) cost per update. Such schemes fail under dependent data, with empirical coverage rates collapsing even in the presence of mild short-range correlation.
The AR-multiplier online bootstrap introduces dependence among weights that grows with the sample size but remains "locally dependent": multipliers for observations separated by a sufficiently large lag are nearly independent. This structure permits online computation in O(1) time per observation (per chain) and offers theoretical validity for stationary mixing sequences.
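The "locally dependent" behaviour can be illustrated numerically. Under the assumptions of this sketch (unit-variance weights following an AR(1) recursion with the hypothetical schedule `rho_i = 1 - i**(-beta)`), the correlation between the weights at times `s` and `t` is the product of the coefficients in between. The function name and the default `beta` are illustrative choices.

```python
def weight_correlation(s, t, beta=0.3):
    """Correlation between weights at times s <= t under the assumed recursion.

    With unit-variance weights, each AR(1) step multiplies the correlation
    by the coefficient rho_i = 1 - i**(-beta) (assumed schedule).
    """
    corr = 1.0
    for i in range(s + 1, t + 1):
        corr *= 1.0 - i ** (-beta)
    return corr
```

Neighbouring weights become more strongly correlated later in the stream, for instance `weight_correlation(1000, 1001)` exceeds `weight_correlation(100, 101)`, while weights a few hundred steps apart, such as `weight_correlation(1000, 1200)`, are essentially uncorrelated.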
The bias-variance trade-off is summarized in the following table:
| Method | MSE Decay Rate | Update Cost |
|---|---|---|
| Block Bootstrap | 3 (for 4) | 5 |
| AR-multiplier Online | 6 (for 7) | 8 |
For streaming applications, this computational advantage of the AR-bootstrap offsets its slightly inferior statistical efficiency relative to block bootstrap (Palm et al., 2023).
5. Empirical Evaluations on Dependent Data
Empirical results confirm the theoretical claims:
- Linear MA(q) Model: For 9 with 0, the AR-bootstrap achieves nominal interval coverage for the sample mean. IID multipliers fail for 1. The MA-block method is valid but computationally intensive.
- Nonlinear Functionals: For 2 and the statistic 3, bootstrap validity for the sample mean extends by the delta method to smooth transformations.
- MA(2)–GARCH(1,1) Model: For a process 4 combining GARCH and moving average structure, the AR-bootstrap yields approximately correct 90% coverage and consistent variance estimation.
Across all considered scenarios, the AR-multiplier online bootstrap delivers consistent uncertainty quantification with minor variance increases relative to block multiplier schemes, while retaining O(1) computation per update.
6. Practical Implementation Guidelines
- Tuning 6: 7 optimizes the mean squared error. Adjustment is possible based on a priori knowledge: increasing 8 reduces variance but induces more bias (faster forgetting); decreasing 9 increases dependence, reducing bias but potentially raising variance.
- Number of Chains 0: Empirically, 1–2 yields stable inference for typical quantile estimation.
- Initialization: Weight chains are initialized at zero; a "burn-in" of 3 steps can be disregarded in small samples to mitigate transient initialization effects.
- Multivariate and Nonlinear Statistics: Apply the same AR-multiplier chain across all dimensions of a vector or employ the delta method for smooth functionals of the mean.
- Computational Load: Each observation requires three multiplications per chain and one standard normal draw; memory cost is constant per chain. There is no need to store the historical data series.
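The guideline on multivariate and nonlinear statistics might look as follows in code. This is a sketch with hypothetical names: one chain keeps a single weight sum and one weighted sum per coordinate, so the same multiplier is applied to every dimension, and a smooth functional is evaluated on the resulting weighted mean vector. The weight `v` is assumed to come from an AR(1) weight chain maintained elsewhere.

```python
def update_vector_chain(state, x, v):
    """Add observation vector x with a single shared weight v to one chain.

    state is a dict with 'w_sum' (float) and 's_sum' (list of floats,
    one entry per coordinate of x).
    """
    state["w_sum"] += v
    for j, xj in enumerate(x):
        state["s_sum"][j] += v * xj


def functional_replicate(state, g):
    """Evaluate a smooth functional g on the chain's weighted mean vector."""
    mean = [s / state["w_sum"] for s in state["s_sum"]]
    return g(mean)
```

For instance, with `g = lambda m: m[0] / m[1]` each chain yields one bootstrap replicate of a ratio of means, and the spread of these replicates across chains can be summarized in the same way as for the plain sample mean.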
The AR-multiplier online bootstrap thus provides a computationally efficient and theoretically sound alternative to classical block-based bootstrapping for dependent data in real-time environments (Palm et al., 2023).