Prediction-Balanced Reservoir Sampling
- The paper introduces PBRS as a novel method for maintaining a prediction-balanced buffer to enable robust continual test-time adaptation on non-i.i.d. streams.
- PBRS employs prediction-balanced insertion and class-conditioned reservoir sampling to effectively mitigate overfitting and class imbalance in dynamic environments.
- Empirical evaluations demonstrate that PBRS outperforms prior TTA methods by significantly lowering error rates across benchmarks such as CIFAR10-C and ImageNet-C.
Prediction-Balanced Reservoir Sampling (PBRS) is an algorithmic method designed to mitigate overfitting and improve generalization in continual test-time adaptation (TTA) under temporally correlated (non-i.i.d.) data streams. Introduced as part of the NOTE framework for robust continual test-time adaptation, PBRS maintains a small memory buffer of test samples that approximates a class-balanced, nearly i.i.d.-like subsample of the non-i.i.d. test stream, with class membership inferred from the model's own predictions. This enables robust adaptation of normalization statistics in the presence of severe class imbalance and temporal correlation (Gong et al., 2022).
1. Motivation and Problem Setting
Continual TTA assumes a model operates under distribution shift, adapting on the fly using only the incoming stream of unlabeled test data. Many existing TTA algorithms rely on batch statistics (e.g., to recalibrate BatchNorm layers) or on entropy minimization over each batch. These methods tend to overfit when the data stream is non-i.i.d., such as the temporally correlated sequences common in real-world scenarios, because transient class imbalance biases the model toward the momentary majority classes. PBRS was introduced to address this by simulating an i.i.d. adaptation buffer through prediction- and time-balanced sample selection (Gong et al., 2022).
2. Algorithmic Structure of PBRS
PBRS maintains a fixed-capacity memory M = {(x_i, ŷ_i)} of size N, where x_i denotes a test sample and ŷ_i its current model-predicted class. For each new test point x_t with predicted label ŷ_t, PBRS applies two interleaved update mechanisms:
- Prediction-Balanced Insertion: If ŷ_t is underrepresented in M relative to other predicted classes, PBRS uniformly selects an instance belonging to a majority (over-represented) class in M and replaces it with (x_t, ŷ_t).
- Class-Conditioned Reservoir Sampling: If ŷ_t is not underrepresented, reservoir sampling is performed within class ŷ_t. For class c, the replacement probability for a newly seen sample is m_c / n_c, where m_c is the count of class c in M and n_c is the cumulative count of class-c predictions encountered so far.
This buffer replacement is performed online with O(1) per-sample overhead.
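The two update mechanisms can be sketched as follows. This is a minimal illustrative implementation, not the authors' reference code; the class name `PBRSBuffer` and its internal structure are assumptions for exposition.

```python
import random
from collections import defaultdict

class PBRSBuffer:
    """Illustrative sketch of Prediction-Balanced Reservoir Sampling.
    Stores (sample, predicted_label) pairs in a fixed-capacity memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []              # memory M: list of (x, y_hat) pairs
        self.seen = defaultdict(int)  # n_c: class-c predictions seen so far

    def _counts(self):
        counts = defaultdict(int)     # m_c: class-c instances currently in M
        for _, y in self.buffer:
            counts[y] += 1
        return counts

    def add(self, x, y_hat):
        self.seen[y_hat] += 1
        if len(self.buffer) < self.capacity:   # buffer filling phase
            self.buffer.append((x, y_hat))
            return
        counts = self._counts()
        if counts[y_hat] < max(counts.values()):
            # Prediction-balanced insertion: evict a random sample of
            # the (over-represented) majority class.
            majority = max(counts, key=counts.get)
            victims = [i for i, (_, y) in enumerate(self.buffer) if y == majority]
            self.buffer[random.choice(victims)] = (x, y_hat)
        else:
            # Class-conditioned reservoir sampling: accept with
            # probability m_c / n_c, evicting a same-class sample.
            if random.random() < counts[y_hat] / self.seen[y_hat]:
                same = [i for i, (_, y) in enumerate(self.buffer) if y == y_hat]
                self.buffer[random.choice(same)] = (x, y_hat)
```

Feeding a temporally skewed stream (a long run of one class followed by another) leaves the buffer evenly split between the two classes, which is the balancing behavior described above.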
3. Mathematical Formulation and Buffer Dynamics
Let n_c(t) denote the running total of test samples with model-predicted class c up to time t, and m_c the count of class-c instances in M. The insertion rules are as follows:
- For an incoming test example x_t with predicted label ŷ_t = c:
  - Buffer filling phase: If |M| < N, append (x_t, c) to M.
  - Prediction-balanced replacement: If c is a minority class in M, replace a randomly chosen sample carrying a majority label with (x_t, c).
  - Class-conditioned sampling: Otherwise, with probability m_c / n_c(t), replace a randomly chosen buffer sample with label c by (x_t, c).
Mathematically, the probability that x_t is admitted into M is 1 if |M| < N or c is a minority class in M, and m_c / n_c(t) otherwise.
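The time-uniformity of the class-conditioned rule can be checked empirically for a single class: replacing with probability m_c / n_c makes every sample of that class (roughly) equally likely to reside in the buffer, regardless of arrival time. A minimal simulation, with the function name `class_reservoir` chosen for illustration:

```python
import random
from collections import Counter

def class_reservoir(stream, capacity):
    """Within-class reservoir sampling sketch: once the buffer is full,
    a new sample replaces a random buffered one with probability m / n,
    where m is the buffer occupancy and n the number seen so far."""
    buf = []
    for n, x in enumerate(stream, start=1):
        if len(buf) < capacity:
            buf.append(x)
        elif random.random() < len(buf) / n:
            buf[random.randrange(len(buf))] = x
    return buf

# Over many runs, each of the 100 stream items should be retained with
# probability close to capacity / 100, independent of arrival time.
random.seed(0)
tally = Counter()
for _ in range(5000):
    tally.update(class_reservoir(range(100), 4))
```

Each item's expected retention count is 5000 × 4/100 = 200; early, middle, and late items land near that value, confirming the per-class time-uniform property.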
4. Integration with Continual Test-time Adaptation
PBRS operates in tandem with Instance-Aware Batch Normalization (IABN). After every N insertions, the buffer is used to recompute normalization statistics and to update the affine parameters via a single backward adaptation pass. The global mean and variance are updated using exponential moving averages:
μ ← (1 − m) μ + m μ̂,  σ² ← (1 − m) σ² + m σ̂²,
with momentum m. Here, μ̂ and σ̂² are calculated from the activations of the buffered samples. Only the affine parameters γ and β are optimized, via Adam. The buffer size N matches a common mini-batch size.
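The EMA update above can be written out for a single scalar feature; this is a pure-Python sketch (the function name and scalar simplification are illustrative, not the authors' code):

```python
def ema_update(mu, var, batch, momentum):
    """Exponential-moving-average update of normalization statistics
    from a list of buffered scalar activations (illustrative sketch)."""
    n = len(batch)
    mu_hat = sum(batch) / n                              # batch mean
    var_hat = sum((a - mu_hat) ** 2 for a in batch) / n  # batch variance
    mu = (1.0 - momentum) * mu + momentum * mu_hat
    var = (1.0 - momentum) * var + momentum * var_hat
    return mu, var
```

For example, with running statistics (μ = 0, σ² = 1), buffered activations [2.0, 4.0], and momentum m = 0.1, the batch statistics are μ̂ = 3 and σ̂² = 1, giving updated values μ = 0.3 and σ² = 1.0.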
5. Empirical Performance Evaluation
PBRS, in conjunction with IABN, was evaluated across multiple benchmarks featuring severe temporal class imbalance as well as real-world data streams. In the non-i.i.d. setting, the mean error rates achieved by NOTE (IABN + PBRS) were:
| Dataset | NOTE (IABN+PBRS) | Best Prior Baseline |
|---|---|---|
| CIFAR10-C | 21.1% | 36.2% (LAME) |
| CIFAR100-C | 47.0% | 63.3% (LAME) |
| ImageNet-C | 80.6% | 82.7% |
| KITTI-Rain | 10.9% | 11.3% |
| HARTH | 51.0% | 61.0% |
| ExtraSensory | 45.4% | 50.7% |
NOTE outperforms all other TTA methods (BN-Stats, ONDA, PL, TENT, LAME, CoTTA) in non-i.i.d. streams, and matches or surpasses them when the i.i.d. assumption holds (Gong et al., 2022).
6. Theoretical and Empirical Properties
PBRS does not come with formal unbiasedness proofs, but it empirically maintains class frequencies in the buffer close to the long-term average as predicted by the model. The class-conditioned reservoir sampling ensures per-class time-uniform sampling, and the prediction-balanced policy prevents any class from dominating under severe drift. Ablation studies show a near-uniform class distribution in M even under pronounced temporal skew, underpinning robust adaptation dynamics (Gong et al., 2022).
7. Implementation Details and Practical Considerations
Key parameters include the buffer size N, the BatchNorm EMA momentum m, and the Adam learning rate used for adaptation. Storage requirements are minimal: only N (sample, predicted-label) pairs, with O(1) computational cost per sample for buffer management and a single forward-backward pass per adaptation step (triggered every N samples). PBRS is always paired with IABN for maximum robustness in the non-i.i.d. paradigm (Gong et al., 2022).