Ranked Set Sampling (RSS) Model
- RSS is a stratified sampling technique that uses auxiliary ranking to select units for measurement and improves estimation efficiency over simple random sampling.
- It supports both balanced (BRSS) and unbalanced (URSS) approaches, using strategies like Neyman allocation to minimize variance and optimize sample allocation.
- The generalRSS package implements robust inferential procedures, including tests for means, medians, proportions, and AUC analysis, enhancing practical data analysis.
The Ranked Set Sampling (RSS) model is a stratified sampling technique characterized by rank-based selection and measurement, designed to yield higher efficiency than simple random sampling (SRS) when auxiliary ranking information is available. It encompasses both balanced (BRSS) and unbalanced (URSS) variants, each enabling tailored allocation strategies and statistical inference. Recent advances, as implemented in the generalRSS package, provide optimized sample allocation, variance-minimizing designs, and robust inferential procedures for one-sample means, medians, and proportions, as well as two-sample area under the curve (AUC) comparisons (Moon et al., 2 Sep 2025).
1. Model Definition and Sampling Schemes
Ranked Set Sampling is parameterized by a set size $H$ and a total sample size $n$. The essential procedure is as follows: for each cycle, $H$ sets of $H$ items are drawn, and each set is ranked using an inexpensive or auxiliary process. Then, one specimen per rank stratum is selected for actual measurement (the $h$-th ranked unit from the $h$-th set). The data thus comprise observations $X_{[h]i}$, $i = 1, \dots, n_h$, $h = 1, \dots, H$, where $i$ indexes the measured units within rank $h$ and $n_h$ is the sample size for that stratum.
- Balanced RSS (BRSS): All strata receive equal allocation ($n_h = m$; total $n = mH$), replicating a cycle of $H$ ranked sets $m$ times.
- Unbalanced RSS (URSS): Strata allocations $(n_1, \dots, n_H)$ may differ. Sampling continues until each stratum’s quota is reached, supporting efficient estimation in skewed populations.
This configuration generalizes stratified sampling by exploiting a non-measurement-based ranking procedure to enhance the representativeness or informativeness of selected units.
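The sampling scheme can be illustrated with a minimal Python sketch. It is not the generalRSS implementation: the function name `rss_sample`, its parameters, and the assumption of perfect ranking on the true value (when no `rank_key` is given) are all hypothetical choices made for illustration.

```python
import random


def rss_sample(population_draw, allocation, rank_key=None, rng=None):
    """Draw a ranked set sample (illustrative sketch, not the generalRSS API).

    population_draw(rng) returns one unit from the population.
    allocation[h] is the number of measured units for rank h + 1,
    so the set size H equals len(allocation).
    rank_key ranks the units in a set; by default units are ranked on
    their own value, i.e. perfect ranking is assumed.
    """
    rng = rng or random.Random()
    rank_key = rank_key or (lambda unit: unit)
    set_size = len(allocation)
    strata = [[] for _ in range(set_size)]
    for h, n_h in enumerate(allocation):
        for _ in range(n_h):
            # Draw one set of H units, rank it, and measure only the
            # unit holding rank h + 1; the others are discarded.
            ranked_set = sorted(
                (population_draw(rng) for _ in range(set_size)), key=rank_key
            )
            strata[h].append(ranked_set[h])
    return strata


# Balanced design: H = 3 and m = 4 measured units per rank stratum.
brss = rss_sample(lambda r: r.gauss(0.0, 1.0), [4, 4, 4], rng=random.Random(1))
# Unbalanced design: more units allocated to the extreme ranks.
urss = rss_sample(lambda r: r.gauss(0.0, 1.0), [5, 2, 5], rng=random.Random(1))
```

Passing a `rank_key` based on a cheap auxiliary variable instead of the measured value would mimic imperfect ranking.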
2. Point Estimation and Variance Formulas
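The primary estimator for the population mean in RSS is

$$\hat{\mu}_{\mathrm{RSS}} = \frac{1}{H} \sum_{h=1}^{H} \bar{X}_{[h]},$$

where $H$ is the set size and $\bar{X}_{[h]} = \frac{1}{n_h} \sum_{i=1}^{n_h} X_{[h]i}$ denotes the stratum mean for rank $h$. The variance of $\hat{\mu}_{\mathrm{RSS}}$ is

$$\operatorname{Var}(\hat{\mu}_{\mathrm{RSS}}) = \frac{1}{H^2} \sum_{h=1}^{H} \frac{\sigma_{[h]}^2}{n_h},$$

where $\sigma_{[h]}^2$ is the population variance for stratum $h$. For implementation, the sample variances $s_{[h]}^2$ are substituted.
For BRSS ($n_h = m$ for all $h$),

$$\operatorname{Var}(\hat{\mu}_{\mathrm{BRSS}}) = \frac{1}{m H^2} \sum_{h=1}^{H} \sigma_{[h]}^2.$$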
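As a hedged illustration, the sketch below computes the RSS mean estimate (the average of the rank-stratum means) and its plug-in variance (the sum of stratum sample variances divided by their sample sizes, scaled by the squared set size). The function names and the toy data are assumptions for this example, and each stratum needs at least two observations for the sample variance to exist.

```python
from statistics import mean, variance


def rss_mean(strata):
    """Average of the rank-stratum means (equal weight 1/H per stratum)."""
    set_size = len(strata)
    return sum(mean(stratum) for stratum in strata) / set_size


def rss_mean_variance(strata):
    """Plug-in variance: (1 / H^2) * sum over strata of s_h^2 / n_h."""
    set_size = len(strata)
    return sum(variance(s) / len(s) for s in strata) / set_size**2


# Toy URSS data with H = 3 and allocation (3, 2, 3); values are made up.
toy_strata = [[1.0, 2.0, 3.0], [4.0, 6.0], [7.0, 9.0, 11.0]]
estimate = rss_mean(toy_strata)
estimated_variance = rss_mean_variance(toy_strata)
```

With equal stratum sizes the same function reproduces the balanced formula, since every term shares the divisor $m$.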
3. Optimal Sample Allocation Strategies
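URSS enables variance minimization through allocation optimization given a fixed total sample size $n$. The classical Neyman allocation seeks

$$n_h = n \, \frac{\sigma_{[h]}}{\sum_{k=1}^{H} \sigma_{[k]}}, \qquad h = 1, \dots, H.$$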
Three strategies for continuous outcomes (as implemented in generalRSS) are:
- Integer Neyman
- Adjusted Neyman (minimal extra units for integer constraints)
- Local-Ratio-Consistent (LRC), a data-adaptive refinement
For binary outcomes, stratum-specific Bernoulli variances $p_{[h]}(1 - p_{[h]})$ guide Neyman-style allocation. This allows URSS to emphasize ranks with greater information yield, which is crucial in skewed or heteroscedastic populations.
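A minimal sketch of Neyman-style allocation follows, assuming the stratum standard deviations are known or have been estimated from pilot data. The integer step uses a plain largest-remainder rounding, which is an assumption made here for illustration and is not claimed to match the Integer Neyman, Adjusted Neyman, or LRC rules in generalRSS. The function names are hypothetical.

```python
import math


def neyman_allocation(sigmas, n):
    """Split n units across strata in proportion to their standard deviations.

    Real-valued targets n * sigma_h / sum(sigma) are rounded to integers
    with a largest-remainder rule so that the allocation sums to n.
    A stratum with a very small sigma can receive zero units under this
    simple rule; a practical design would enforce a minimum per stratum.
    """
    total_sigma = sum(sigmas)
    ideal = [n * s / total_sigma for s in sigmas]
    alloc = [math.floor(x) for x in ideal]
    leftover = n - sum(alloc)
    # Hand the remaining units to the strata with the largest fractions.
    by_fraction = sorted(
        range(len(sigmas)), key=lambda h: ideal[h] - alloc[h], reverse=True
    )
    for h in by_fraction[:leftover]:
        alloc[h] += 1
    return alloc


def bernoulli_sd(p):
    """Standard deviation of a Bernoulli(p) outcome, sqrt(p * (1 - p))."""
    return math.sqrt(p * (1.0 - p))


# Continuous outcome: made-up stratum standard deviations.
continuous_alloc = neyman_allocation([1.0, 2.0, 3.0], n=12)
# Binary outcome: made-up stratum success probabilities.
binary_alloc = neyman_allocation([bernoulli_sd(p) for p in (0.1, 0.4, 0.8)], n=20)
```

Strata with larger variability receive more measurements, which is the mechanism by which URSS can reduce the variance of the mean estimator relative to a balanced design.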
4. Statistical Inference Procedures
A suite of inferential tools exists for RSS data:
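- Normal Approximation (z-test):

$$Z = \frac{\hat{\mu}_{\mathrm{RSS}} - \mu_0}{\sqrt{\widehat{\operatorname{Var}}(\hat{\mu}_{\mathrm{RSS}})}} \;\overset{d}{\to}\; N(0, 1)$$

supports large-sample confidence intervals and tests.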
- Student’s t-approximation (t-test):
Degrees of freedom estimated using Welch-type or Satterthwaite-type formulas.
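- Empirical Likelihood Ratio (ELR) Test: For $H_0: \mu = \mu_0$, the statistic is obtained by maximizing the RSS empirical likelihood under the null constraint; significance is assessed from the asymptotic distribution of the log-likelihood ratio.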
- Sign Test (Median): Asymptotic pivots for BRSS and URSS, with formulas following the approaches of Hettmansperger and Barabesi.
- Proportion Test: Consistent estimator for population proportions in dichotomous outcomes.
- AUC Comparison (Two-Sample): Mann–Whitney-type empirical likelihood for RSS supports robust area estimation.
The generalRSS package fully implements these procedures for both balanced and unbalanced schemes.
| Test Type | RSS Formula | Applicability |
|---|---|---|
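| Mean | $\hat{\mu}_{\mathrm{RSS}}$ and variance formula | All RSS (BRSS, URSS) |
| Proportion | $\hat{p}_{\mathrm{RSS}} = \frac{1}{H} \sum_{h=1}^{H} \hat{p}_{[h]}$ | Binary RSS |
| ELR (mean/median) | Empirical likelihood ratio statistic | All RSS |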
| AUC (2-sample) | Mann–Whitney empirical likelihood | All RSS |
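The normal-approximation test for the mean can be sketched as follows. This is an illustrative Python version under stated assumptions, not the generalRSS implementation: the function name `rss_z_test`, its return structure, and the toy data are hypothetical, and the standard error is the plug-in quantity built from stratum sample variances.

```python
import math
from statistics import NormalDist, mean, variance


def rss_z_test(strata, mu0, conf_level=0.95):
    """Two-sided large-sample z-test and confidence interval for the mean."""
    set_size = len(strata)
    estimate = sum(mean(s) for s in strata) / set_size
    # Plug-in standard error: sqrt(sum of s_h^2 / n_h) / H.
    std_error = math.sqrt(sum(variance(s) / len(s) for s in strata)) / set_size
    z = (estimate - mu0) / std_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    critical = NormalDist().inv_cdf(0.5 + conf_level / 2.0)
    return {
        "estimate": estimate,
        "z": z,
        "p_value": p_value,
        "ci": (estimate - critical * std_error, estimate + critical * std_error),
    }


# Toy URSS data with H = 3; values are made up for illustration.
result = rss_z_test([[1.0, 2.0, 3.0], [4.0, 6.0], [7.0, 9.0, 11.0]], mu0=5.0)
```

For small stratum sizes such as these, the t-approximation with estimated degrees of freedom would usually be preferred over the normal reference distribution.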
5. Real-World Implementation: generalRSS R Package
The generalRSS package operationalizes RSS/BRSS/URSS through:
- Sampling generators (rss.sampling, rss.simulation)
- Allocation strategies (rss.design for Neyman, adjusted Neyman, LRC)
- Inference functions for mean, median, proportion, and AUC (rss.t.test, rss.ELR.test, rss.AUC.test, etc.)
- Workflow support for handling missing data, optimizing allocations, and executing statistical tests
A typical workflow generates samples with a sampling function (for example rss.sampling or rss.simulation), performs inference with a test function (for example rss.t.test), and then proceeds to allocation optimization with rss.design and confidence interval computation.
6. Applied Examples and Efficiency Gains
Empirical results with NHANES data demonstrate substantial efficiency improvements:
- One-sample mean (BMI):
  - Original URSS: mean CI length 4.93, coverage 0.954
  - Optimized URSS (adjusted Neyman): CI length 4.67, same coverage
  - SRS: CI length 5.93, coverage 0.944
- Two-sample AUC (FPG vs. HbA1c):
  - URSS: CI length 0.171, coverage 0.934
  - BRSS: CI length 0.175, coverage 0.930
  - SRS: CI length 0.181, coverage 0.930
Allocation strategies targeting informative strata produced tighter intervals and higher mean efficiency compared to SRS and traditional BRSS.
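To put the reported interval lengths on a common scale, the short sketch below computes the percentage reduction in confidence interval length relative to SRS. It uses only the figures quoted in this section; the helper name `pct_reduction` is hypothetical, and the comparison is a simple arithmetic summary rather than a formal efficiency measure.

```python
def pct_reduction(length, baseline):
    """Percentage by which a CI length is shorter than the baseline length."""
    return 100.0 * (baseline - length) / baseline


# CI lengths as reported above for the one-sample mean (BMI).
mean_ci_lengths = {"URSS (original)": 4.93, "URSS (adjusted Neyman)": 4.67}
srs_mean_length = 5.93

# CI lengths as reported above for the two-sample AUC.
auc_ci_lengths = {"URSS": 0.171, "BRSS": 0.175}
srs_auc_length = 0.181

for design, length in mean_ci_lengths.items():
    print(f"Mean, {design}: {pct_reduction(length, srs_mean_length):.1f}% shorter than SRS")
for design, length in auc_ci_lengths.items():
    print(f"AUC, {design}: {pct_reduction(length, srs_auc_length):.1f}% shorter than SRS")
```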
7. Context and Extensions
The modern RSS model leverages auxiliary ranking for variance reduction, especially relevant in environmental, biomedical, and industrial surveys where measurements are costly but ranking is feasible. Extensions such as URSS and LRC allocation, alongside empirical likelihood and robust interval estimation, have expanded the scope and utility of RSS in domain-specific inference. The generalRSS package offers a comprehensive platform for applied RSS design and analysis, validated through large-scale medical data studies and simulation (Moon et al., 2 Sep 2025).
Works referenced include Chen et al. (2006), Ahn et al. (2022, 2024), Barabesi (2001), and Moon et al. (2022), together with the implementation in generalRSS v0.1.3.