Private Conformity via Quantile Search (P-COQS)
- The paper introduces P-COQS, a framework that replaces direct quantile computation with a noisy binary search to ensure differential privacy in calibration.
- The method leverages randomized binary search with zCDP composition, providing bounded rank error and near-nominal coverage for prediction sets.
- Empirical results on benchmark datasets show that P-COQS achieves efficient, robust predictions with smaller set sizes and controlled privacy loss.
Private Conformity via Quantile Search (P-COQS) is a methodological framework for constructing differentially private uncertainty-quantifying prediction sets, primarily in the context of conformal prediction. The core idea is to protect privacy in the critical calibration phase of split conformal prediction by employing a differentially private quantile search, thereby ensuring that the released prediction sets maintain rigorous privacy guarantees while approximating the statistical properties (notably coverage) of their nonprivate counterparts. P-COQS leverages randomized binary search augmented with privacy noise in order to robustly and efficiently estimate quantiles required for prediction sets, and has been empirically validated against leading private conformal prediction alternatives across both simulated and large-scale benchmark datasets (Romanus et al., 15 Jul 2025).
1. Motivation and Quantile Search in Conformal Prediction
Traditional split conformal prediction (CP) methods construct prediction sets by calibrating a quantile (typically the $(1-\alpha)$ quantile) of nonconformity scores computed on a held-out calibration dataset. While this guarantees marginal coverage, direct computation or release of the quantile (or of the nonconformity scores) can disclose individual-level information and so risks privacy breaches. To address this, P-COQS provides formal privacy protection for the calibration phase via differentially private quantile estimation.
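For reference, the non-private calibration step that P-COQS replaces can be sketched as follows. This is a minimal illustration of the usual split-conformal rule (take the $\lceil (n+1)(1-\alpha) \rceil$-th smallest calibration score); the function name `conformal_quantile` is hypothetical and not taken from the paper.

```python
import math
import numpy as np

def conformal_quantile(scores, alpha):
    """Non-private split-conformal threshold (illustrative helper, not from the paper).

    Returns the ceil((n + 1) * (1 - alpha))-th smallest calibration score,
    or +inf when the calibration set is too small for the requested level.
    """
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    k = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank of the threshold
    if k > n:
        return math.inf  # no finite threshold gives the requested coverage
    return float(scores[k - 1])
```

Because the returned value is an actual calibration score, releasing it directly exposes one individual's score, which is the leakage that motivates a private replacement.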
At the methodological core, P-COQS re-implements the CP calibration quantile computation as a noisy, iterative binary search. Rather than selecting the empirical quantile directly, the algorithm divides the score domain and repeatedly queries which half the private quantile lies in, replacing the exact count query with a privatized noisy count (for instance, via the Gaussian mechanism). This induces a randomized partitioning process that converges to an approximately correct quantile while controlling privacy loss (Romanus et al., 15 Jul 2025).
2. Algorithmic Structure and Privacy Mechanism
P-COQS employs a randomized binary search to identify the quantile of interest in a set of calibration scores $s_1, \dots, s_n$. At every step, given an interval $[a, b]$:
- Compute the midpoint $m = (a + b)/2$.
- Use a differentially private noisy count, e.g.,
  $\tilde{c}(m) = \#\{i : s_i \le m\} + Z, \quad Z \sim \mathcal{N}(0, \sigma^2),$
  where $\sigma$ is chosen according to the privacy parameter $\rho$ (zero-concentrated differential privacy (zCDP); for example, $\sigma^2 = T/(2\rho)$ yields $\rho$-zCDP across $T$ sensitivity-one count queries).
- Depending on the noisy count, update $a$ or $b$ to narrow the bracket containing the target quantile.
- After $T$ iterations, output the midpoint as the DP quantile estimate.
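The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the argument names (`lo`, `hi`, `n_iters`, `sigma`, `target_rank`), and the update rule (move the lower end up when the noisy count is below the target rank) are all choices made here for the example.

```python
import numpy as np

def noisy_binary_search_quantile(scores, target_rank, lo, hi, n_iters, sigma, rng=None):
    """Sketch of a noisy binary search for a quantile (hypothetical helper).

    scores      : calibration nonconformity scores, assumed to lie in [lo, hi]
    target_rank : desired rank, e.g. ceil((n + 1) * (1 - alpha))
    sigma       : std of the Gaussian noise added to each count query
    """
    rng = np.random.default_rng() if rng is None else rng
    scores = np.asarray(scores, dtype=float)
    for _ in range(n_iters):
        mid = 0.5 * (lo + hi)
        # Count query has sensitivity 1; Gaussian noise privatizes it.
        noisy_count = np.sum(scores <= mid) + rng.normal(0.0, sigma)
        if noisy_count < target_rank:
            lo = mid  # too few scores below mid: search the upper half
        else:
            hi = mid  # enough scores below mid: search the lower half
    return 0.5 * (lo + hi)
```

With `sigma=0.0` the routine reduces to an ordinary bisection on the empirical CDF, which is a convenient sanity check before adding noise.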
The global privacy guarantee across all iterations results from zCDP composition; the total privacy budget is distributed over the binary search steps. This mechanism ensures that the computed quantile satisfies $\rho$-zCDP (convertible to $(\varepsilon, \delta)$-differential privacy if desired), thus providing a rigorous upper bound on disclosure risk for the calibration data (Romanus et al., 15 Jul 2025).
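The accounting can be illustrated with two small helpers. They rely only on standard zCDP facts: a Gaussian mechanism with sensitivity $\Delta$ and standard deviation $\sigma$ satisfies $(\Delta^2 / 2\sigma^2)$-zCDP, zCDP parameters add under composition, and $\rho$-zCDP implies $(\rho + 2\sqrt{\rho \ln(1/\delta)}, \delta)$-DP. The even split of the budget across steps and the function names are assumptions made for this sketch; the paper may allocate the budget differently.

```python
import math

def per_step_sigma(rho_total, n_iters, sensitivity=1.0):
    """Gaussian noise std per count query when rho_total is split evenly (assumed)."""
    rho_step = rho_total / n_iters
    # Gaussian mechanism: rho_step = sensitivity**2 / (2 * sigma**2)
    return sensitivity / math.sqrt(2.0 * rho_step)

def zcdp_to_approx_dp(rho, delta):
    """Standard conversion: rho-zCDP implies (epsilon, delta)-DP with this epsilon."""
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))
```

For example, `per_step_sigma(0.5, 4)` returns `2.0`: splitting a budget of 0.5 over four queries gives 0.125 per query, hence a noise variance of 4.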
3. Approximate Coverage and Theoretical Guarantees
The introduction of privacy noise creates a quantifiable rank error relative to the empirical quantile. With probability at least $1-\beta$, where $\beta$ is the failure probability, the rank of the output quantile differs from that of the ideal quantile by at most a bounded amount that depends on the noise scale and on $\beta$.
When applied to calibrate prediction sets in conformal prediction, this rank error translates into a maximum possible deviation from nominal coverage of the order of the rank error divided by $n$, where $n$ is the size of the calibration set. This reflects the finite-sample and noise-induced shortfall: prediction sets may slightly under-cover the target probability $1-\alpha$. Nevertheless, empirical evaluations show this under-coverage is typically minor and accurately bounded by the theoretical analysis (Romanus et al., 15 Jul 2025).
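To give a feel for where a bound of this kind can come from, the following is a standard Gaussian tail and union-bound calculation. It is a sketch under assumptions, not the bound stated by Romanus et al.: it assumes $T$ search steps, independent noise $Z_t \sim \mathcal{N}(0, \sigma^2)$ on each count, an even budget split $\sigma^2 = T/(2\rho)$, failure probability $\beta$, and calibration size $n$.

```latex
% Illustrative calculation only; symbols T, sigma, rho, beta, n, tau are assumptions.
\begin{align*}
\Pr\bigl(|Z_t| > \tau\bigr) &\le 2\exp\!\left(-\frac{\tau^2}{2\sigma^2}\right), \\
\Pr\Bigl(\max_{1 \le t \le T} |Z_t| > \tau\Bigr) &\le 2T\exp\!\left(-\frac{\tau^2}{2\sigma^2}\right) = \beta
  \quad\text{for}\quad \tau = \sigma\sqrt{2\ln(2T/\beta)}, \\
\sigma^2 = \frac{T}{2\rho} \;&\Longrightarrow\; \tau = \sqrt{\frac{T}{\rho}\,\ln\frac{2T}{\beta}}, \\
\text{coverage deviation} &\approx \frac{\tau}{n}.
\end{align*}
```

Under these assumptions every noisy count is within $\tau$ of its true value with probability at least $1-\beta$, which suggests a rank error of order $\tau$ and a coverage shortfall that shrinks as $n$ grows or as $\rho$ increases.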
4. Empirical Performance and Comparison
Extensive experiments—including on high-dimensional vision datasets such as CIFAR-10, ImageNet, and CoronaHack—demonstrate key properties of P-COQS:
- Coverage: Prediction sets usually achieve coverage levels close to the nominal level $1-\alpha$, with quantified and bounded under-coverage.
- Efficiency and Informativeness: The method often produces smaller prediction sets (greater informativeness) than leading alternatives, such as methods based on the exponential mechanism with quantile inflation (ExponQ).
- Robustness to Privacy Noise: Across a range of privacy budgets ($\rho$ or $\varepsilon$), performance in terms of coverage and set size remains stable.
- Computational Efficiency: The binary search does not require complex optimization or discretization hyperparameters, yielding significantly faster execution than ExponQ or related DP quantile routines (Romanus et al., 15 Jul 2025).
Empirical coverage, efficiency, and informativeness metrics are consistently reported as favorable, and all coverage shortfalls are tightly explained by the theoretical error bounds arising from privacy noise and the binary search approximation.
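The two headline metrics, empirical coverage and average set size, can be computed as in the sketch below. It assumes a classification setting with the nonconformity score $1 - p_k$ for class $k$, which is a common choice but is an assumption here rather than something specified in this summary; the function names are likewise hypothetical.

```python
import numpy as np

def prediction_sets(probs, threshold):
    """Boolean mask of shape (n, K): class k is included when 1 - p_k <= threshold."""
    return (1.0 - np.asarray(probs, dtype=float)) <= threshold

def coverage_and_size(sets, labels):
    """Return (fraction of sets containing the true label, mean set size)."""
    labels = np.asarray(labels)
    covered = sets[np.arange(labels.size), labels]
    return float(covered.mean()), float(sets.sum(axis=1).mean())
```

Here `threshold` would be the calibrated quantile, private or not, so the same two functions can be used to compare methods on equal footing.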
5. Privacy, Utility Trade-offs, and Limitations
By leveraging zCDP composition and the binary search protocol, P-COQS achieves a principled trade-off: privacy is guaranteed under formal (zCDP or $(\varepsilon, \delta)$-DP) definitions, while the sacrificed coverage is predictable and usually limited. The required number of iterations grows only logarithmically in the score domain's resolution, making the privacy cost scalable for reasonably sized calibration sets and score domains.
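The logarithmic growth is easy to see numerically. The helper below is a hypothetical illustration: it counts how many halvings are needed to shrink an interval `[lo, hi]` to a target width `resolution`, which is one natural way to pick the number of search steps.

```python
import math

def iterations_for_resolution(lo, hi, resolution):
    """Number of halvings needed to shrink [lo, hi] to width <= resolution."""
    if resolution <= 0 or hi <= lo:
        raise ValueError("need hi > lo and resolution > 0")
    return max(1, math.ceil(math.log2((hi - lo) / resolution)))

# Refining the resolution by a factor of 100 adds only about 7 steps.
for res in (1e-2, 1e-4, 1e-6):
    print(res, iterations_for_resolution(0.0, 1.0, res))
```

Since each extra step consumes part of a fixed privacy budget, a finer resolution is paid for with more noise per query, but only slowly.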
A notable limitation is the slight under-coverage at finite sample sizes: the introduced noise can occasionally cause the selected quantile to fall short of the nominal threshold. This effect is controlled and typically small, and it becomes progressively less significant as $n$ increases or as privacy noise is reduced.
6. Extensions, Practical Implications, and Future Directions
P-COQS provides a modular and model-agnostic approach to differentially private conformal prediction, applicable to any model for which nonconformity scores are defined and bounded. It is particularly relevant in settings where the calibration data is highly sensitive and privacy requirements are mandated by policy or regulation. The approach is computationally attractive for deployment with large datasets, non-tabular data, or when multiple quantiles require simultaneous estimation.
Future directions highlighted by the analysis include:
- Developing mechanisms to further narrow the under-coverage gap, potentially via distributional assumptions or more adaptive noise allocations.
- Generalizing to continuous-output prediction intervals and regression settings.
- Integrating advanced privacy accounting for more complex analytic pipelines or repeated application of the procedure.
In summary, Private Conformity via Quantile Search (P-COQS) offers a theoretically grounded and empirically validated approach to producing calibrated, efficient, and privacy-preserving prediction sets, bridging the requirements of statistical validity and individual privacy for uncertainty quantification tasks in modern machine learning and data analysis (Romanus et al., 15 Jul 2025).