Repeat Factor Sampling for Object Detection
- Repeat Factor Sampling (RFS) is a data-driven strategy that mitigates class imbalance in object detection by elevating the sampling probability for images with rare classes.
- It calculates class-specific repeat factors based on image frequencies and normalizes them, ensuring underrepresented objects receive more training exposure.
- Instance-Aware and Exponentially Weighted variants enhance RFS by incorporating instance-level statistics and exponential scaling, leading to significant performance gains on long-tailed datasets.
Repeat Factor Sampling (RFS) is a data-driven rebalancing strategy developed principally to address class imbalance in large-scale, long-tailed object detection datasets. It has become a foundational approach for sampling-based mitigation of rare-class under-representation and has inspired a sequence of methodological advances (notably Instance-Aware Repeat Factor Sampling and exponentially weighted variants) for practical deployment in modern object detectors. The abbreviation RFS also denotes a distinct notion in quantum computing (recursive Fourier sampling), but the prevailing usage in machine learning and computer vision concerns its role in training-procedure design.
1. Fundamental Principles of Repeat Factor Sampling
RFS was introduced to mitigate the severe skew arising when a small set of classes dominates the majority of annotated images, while many rare categories appear in only a few images. The core idea is to increase the sampling probability of images that contain rare classes, thereby exposing the detector to more training instances of underrepresented objects.
Formally, let $\mathcal{C}$ denote the set of all classes and $N$ the total number of training images. For each class $c \in \mathcal{C}$, define the image frequency as

$$f(c) = \frac{1}{N}\,\bigl|\{\, I : c \in I \,\}\bigr|,$$

the fraction of training images containing at least one instance of $c$, and fix a rarity threshold $t$ (commonly $t = 10^{-3}$, as in LVIS). The class-specific repeat factor is then

$$r(c) = \max\!\left(1, \sqrt{\tfrac{t}{f(c)}}\right).$$

For each image $I$, the image-level repeat factor is

$$r(I) = \max_{c \in I} r(c).$$

The sampling probability for image $I$ in one epoch is normalized as

$$p(I) = \frac{r(I)}{\sum_{I'} r(I')}.$$

An image containing any rare class ($f(c) < t$) has an elevated $r(I)$ and hence is sampled more frequently during training (Yaman et al., 2023, Ahmed et al., 27 Mar 2025).
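The computation above can be sketched in a few lines of Python; the function name and input format (a set of class ids per image) are illustrative choices, not taken from the cited implementations.

```python
import math
from collections import defaultdict

def rfs_probabilities(image_labels, t=1e-3):
    """Per-image RFS sampling probabilities for one epoch.

    image_labels: list of sets, the class ids present in each image.
    t: rarity threshold; classes with f(c) < t are oversampled.
    """
    n = len(image_labels)
    # f(c): fraction of images containing at least one instance of c
    img_count = defaultdict(int)
    for labels in image_labels:
        for c in labels:
            img_count[c] += 1
    f = {c: cnt / n for c, cnt in img_count.items()}
    # r(c) = max(1, sqrt(t / f(c)))
    r_class = {c: max(1.0, math.sqrt(t / fc)) for c, fc in f.items()}
    # r(I) = max over classes in the image; p(I) = r(I) / sum_I' r(I')
    r_img = [max((r_class[c] for c in labels), default=1.0)
             for labels in image_labels]
    total = sum(r_img)
    return [r / total for r in r_img]
```

With an exaggerated threshold (e.g. `t=0.5` on a toy set of four images), the single image holding the rare class receives the largest probability, while the probabilities still sum to one.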
2. Motivation and Limitations
RFS specifically targets the imbalance in image-level class frequencies but ignores the number of object instances per class. This can be problematic: for two rare classes occurring in the same number of images, a class with many instances (a larger annotation density per image) is under-sampled relative to one with fewer instances, since RFS assigns the same $r(c)$ regardless of total instance count (Yaman et al., 2023). This limitation results in inadequate sampling for rare but instance-rich classes, which directly degrades detection performance for these underrepresented categories.
3. Instance-Aware Extensions and Generalizations
To remedy the insensitivity of RFS to per-class instance counts, Instance-Aware Repeat Factor Sampling (IRFS) fuses image-level and instance-level statistics through a geometric mean. For each class $c$:
- $f_i(c)$: fraction of images containing $c$
- $f_b(c)$: fraction of all bounding boxes labeled as $c$
- Geometric mean: $f(c) = \sqrt{f_i(c)\, f_b(c)}$

The IRFS repeat factor:

$$r(c) = \max\!\left(1, \sqrt{\tfrac{t}{f(c)}}\right), \qquad f(c) = \sqrt{f_i(c)\, f_b(c)}.$$

Image-level repeat factors and probabilities are then set as for RFS, with $r(I) = \max_{c \in I} r(c)$ and normalized probabilities $p(I) = r(I) / \sum_{I'} r(I')$ (Yaman et al., 2023, Ahmed et al., 27 Mar 2025). This adjustment enables IRFS to account jointly for rare image occurrence and instance count per class, substantially improving model exposure to difficult classes.
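The fusion step can be sketched as follows, assuming image- and box-level class counts are already available from the annotations; all names here are illustrative.

```python
import math
from collections import defaultdict

def irfs_repeat_factors(image_labels, box_labels, t=1e-3):
    """Class-level IRFS repeat factors.

    image_labels: list of sets of class ids, one set per image.
    box_labels: list of lists of class ids, one id per annotated box.
    """
    n_images = len(image_labels)
    n_boxes = sum(len(boxes) for boxes in box_labels)
    img_count, box_count = defaultdict(int), defaultdict(int)
    for labels in image_labels:
        for c in labels:
            img_count[c] += 1
    for boxes in box_labels:
        for c in boxes:
            box_count[c] += 1
    repeat = {}
    for c in img_count:
        f_i = img_count[c] / n_images   # image frequency
        f_b = box_count[c] / n_boxes    # instance (box) frequency
        f = math.sqrt(f_i * f_b)        # geometric-mean fusion
        repeat[c] = max(1.0, math.sqrt(t / f))
    return repeat
```

A class that is rare at the image level gets a repeat factor above 1 even when a co-occurring frequent class stays clamped at 1, and its instance count now also influences the result.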
Empirical evidence demonstrates that IRFS yields significant improvements over RFS on long-tailed datasets. On LVIS v1.0 with a Mask R-CNN baseline (ResNet-50), rare class AP increases from 9.2 (RFS) to 14.1 (IRFS), a +53% relative gain, with comparable or superior performance observed across detection architectures and ablation settings.
4. Exponentially Weighted IRFS (E-IRFS)
Despite the geometric mean fusion in IRFS, classes that are extremely rare by both image and instance frequencies may remain underexposed. Exponentially Weighted IRFS (E-IRFS) introduces exponential scaling to further amplify the repeat factors for extremely rare classes:

$$r(c) = \max\!\left(1, \lambda \exp\!\left(\sqrt{\tfrac{t}{f(c)}}\right)\right),$$

where $\lambda$ is a scaling hyperparameter chosen empirically. The exponential function achieves super-polynomial separation between rare and frequent classes, aggressively increasing the sampling weight of images containing ultra-rare objects while maintaining stability for common classes.
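In code, the exponential amplification can be contrasted with the square-root rule. The functional form below follows the description in the text and is a sketch, not a verbatim transcription of the paper's equation; `lam` stands in for the scaling hyperparameter.

```python
import math

def rfs_factor(f, t=1e-3):
    # Square-root rule used by RFS/IRFS.
    return max(1.0, math.sqrt(t / f))

def e_irfs_factor(f, t=1e-3, lam=1.0):
    # Exponential amplification of the same rarity ratio; lam is the
    # scaling hyperparameter (assumed form, see lead-in).
    return max(1.0, lam * math.exp(math.sqrt(t / f)))
```

For an ultra-rare class with $f = 10^{-4}$ and $t = 10^{-3}$, the square-root rule yields roughly 3.2 while the exponential rule yields roughly 24, illustrating the much stronger separation for the rare-class regime.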
E-IRFS has been validated on UAV-based surveillance benchmarks (e.g., fire, smoke, human, lake detection), showing improvements of +22% in mAP (from 0.49 to 0.55) and +350% in mAP for the rarest class ("Lake") using YOLOv11-Nano models. Empirically, the benefit is even more pronounced in lightweight models with limited capacity, suggesting that stronger repeat factor scaling is crucial for effective representation learning in resource-constrained settings (Ahmed et al., 27 Mar 2025).
5. Integration and Computational Properties
RFS, IRFS, and E-IRFS are designed as sampling strategies at the data-loading stage. Their integration involves preprocessing (counting image and instance frequencies per class), calculation of per-class repeat factors, and normalization of sampling probabilities. The computational overhead is limited to the frequency computation at initialization; there is no increase in training-time compute per forward pass, nor any modification to the learning rate schedule, augmentation, or network architecture (Yaman et al., 2023).
Sampler variants function as drop-in replacements in frameworks such as MMDetection and Detectron2, with support for both two-stage and one-stage detectors. Recommended hyperparameters are $t = 10^{-3}$ (the LVIS default) and an empirically selected exponential scale $\lambda$ for E-IRFS; excessively large repeat factors should be capped to avoid memory exhaustion in large-scale pipelines (Ahmed et al., 27 Mar 2025).
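At the data-loading stage, per-image repeat factors are typically materialized into an epoch-level index list with stochastic rounding, the scheme used by Detectron2's `RepeatFactorTrainingSampler`; the helper below is a simplified single-process sketch of that idea.

```python
import math
import random

def expand_epoch(image_ids, repeat_factors, seed=0):
    """Build one epoch's (shuffled) image list from repeat factors.

    Each image appears floor(r) times, plus one extra time with
    probability r - floor(r), so its expected count per epoch equals r.
    """
    rng = random.Random(seed)
    epoch = []
    for img_id, r in zip(image_ids, repeat_factors):
        count = int(math.floor(r))
        if rng.random() < r - count:  # stochastic rounding of the fraction
            count += 1
        epoch.extend([img_id] * count)
    rng.shuffle(epoch)
    return epoch
```

Because only the index list is expanded, the per-iteration training cost is unchanged; the entire overhead is this one-time bookkeeping at epoch construction.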
6. Comparative Empirical Analysis
A summary of performance for RFS-based methods is provided in the following table (Mask R-CNN, LVIS v1.0, ResNet-50 backbone) (Yaman et al., 2023):
| Sampling Method | mAP_bbox | AP_r (Rare) | AP_c (Common) | AP_f (Frequent) |
|---|---|---|---|---|
| None | 16.9 | 0.0 | 12.3 | 29.6 |
| RFS | 22.7 | 9.2 | 21.3 | 30.0 |
| IRFS | 24.4 | 14.1 | 22.8 | 30.7 |
Extended to the UAV surveillance domain, E-IRFS delivers additional mAP gains over both RFS and IRFS, particularly for the most under-represented classes and at limited network capacities (Ahmed et al., 27 Mar 2025). Moreover, IRFS demonstrates synergy with instance-level reweighting losses (e.g., Equalization Loss, ECM Loss), producing further improvements when used in tandem.
7. Broader Applications and Theoretical Considerations
Although developed for object detection with long-tailed distributions, the RFS framework and its generalizations are applicable to any learning scenario with rare-class exposure bottlenecks. RFS methods provide a principled, architecture-agnostic mechanism for controlling class distribution at the data level. Theoretically, IRFS and E-IRFS exhibit monotonicity and convexity in their repeat factor scaling, with E-IRFS introducing exponential separation for the rare-class regime:
- Sublinear adjustment: the IRFS repeat factor grows as $\sqrt{t/f(c)}$, sublinearly in the inverse frequency $1/f(c)$
- Super-polynomial adjustment: the E-IRFS repeat factor grows as $\exp\!\left(\sqrt{t/f(c)}\right)$, faster than any polynomial in $1/f(c)$
Selecting the geometric mean as the fusion strategy yields robust trade-offs, with ablation results indicating that geometric and harmonic means outperform arithmetic or quadratic averaging for rare class recall. Optimal threshold selection is data-dependent; excessively high or low thresholds can degrade overall performance (Yaman et al., 2023, Ahmed et al., 27 Mar 2025).
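The mean-fusion trade-off is easy to see numerically; the frequencies below are made-up toy values for a class that is image-rare but instance-rich.

```python
import math

f_img, f_box = 0.01, 0.20  # toy image and instance frequencies

arithmetic = (f_img + f_box) / 2               # 0.105
geometric = math.sqrt(f_img * f_box)           # ~0.0447
harmonic = 2.0 / (1.0 / f_img + 1.0 / f_box)   # ~0.0190

# AM >= GM >= HM: geometric and harmonic fusion assign the class a
# smaller fused frequency, hence a larger repeat factor sqrt(t / f).
assert harmonic < geometric < arithmetic
```

Because the repeat factor is $\sqrt{t/f}$, the smaller geometric or harmonic fused frequency translates directly into stronger oversampling of such classes, consistent with the ablation finding above.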
In summary, Repeat Factor Sampling (RFS) and its instance-aware and exponential generalizations are empirically and theoretically validated solutions for rare-class undersampling in long-tailed object detection, offering scalable, computationally efficient, and easily integrated capabilities that are crucial for modern vision pipelines (Yaman et al., 2023, Ahmed et al., 27 Mar 2025).