Likelihood-Ratio Regions in Statistical Inference
- Likelihood-ratio regions are sets defined by thresholding a likelihood ratio statistic, providing clear, confidence-based boundaries in statistical inference.
- They establish a duality between hypothesis testing and interval construction, underpinning methodologies in high-dimensional, empirical, and quantum settings.
- Advanced computational strategies like radial-profile methods and bootstrap calibration enable efficient estimation of these regions in complex models.
A likelihood-ratio region, in frequentist and Bayesian statistical inference, is a set in the parameter or prediction space whose boundary is defined by thresholding the value of a likelihood (or likelihood-ratio) statistic. These regions underpin hypothesis testing, uncertainty quantification, and signal detection frameworks, providing precise mechanisms for confidence estimation, coverage control, and optimality properties. Likelihood-ratio regions play a central role in fields from parametric inference and high-dimensional statistics to quantum state estimation, receiver operating characteristic (ROC) analysis, and neural network uncertainty quantification.
1. Mathematical Definition and Construction
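Given a statistical model with parameter $\theta$, the classical (profile) likelihood-ratio statistic for testing a constraint $g(\theta) = \psi$ is
$$\Lambda(\psi) = 2\Big[\ell(\hat\theta) - \sup_{\theta:\, g(\theta) = \psi} \ell(\theta)\Big],$$
where $\ell$ is the log-likelihood and $\hat\theta$ is the MLE. The likelihood-ratio region at level $1-\alpha$ is then
$$\mathcal{C}_{1-\alpha} = \{\psi : \Lambda(\psi) \le c_{1-\alpha}\},$$
where $c_{1-\alpha} = \chi^2_{k,1-\alpha}$ and $\chi^2_{k,1-\alpha}$ is the $(1-\alpha)$-quantile of the $\chi^2_k$ distribution, with $k$ the dimension of $\psi$. This construction generalizes to multi-parameter and nonparametric contexts, and can be inverted to obtain regions for predictions or quantiles (Jaeger, 2015, Zhang et al., 2020, Tian et al., 2021).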
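As a concrete illustration of thresholding the statistic, the following minimal sketch inverts the likelihood-ratio test for a single Poisson mean. The Poisson model, the helper names (`poisson_loglik`, `lr_statistic`, `bisect`, `lr_interval`), and the use of plain bisection are assumptions made for this example and are not taken from the cited papers. For one degree of freedom the chi-square quantile equals the squared standard normal quantile, which keeps the code within the Python standard library.

```python
import math
from statistics import NormalDist


def poisson_loglik(mu, total, n):
    """Poisson log-likelihood in mu, up to an additive constant."""
    return total * math.log(mu) - n * mu


def lr_statistic(mu, total, n):
    """Twice the log-likelihood drop from the MLE (the sample mean) to mu."""
    mu_hat = total / n
    return 2.0 * (poisson_loglik(mu_hat, total, n) - poisson_loglik(mu, total, n))


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def lr_interval(counts, alpha=0.05):
    """Likelihood-ratio interval for a Poisson mean (requires sum(counts) > 0)."""
    n, total = len(counts), sum(counts)
    mu_hat = total / n
    # Chi-square(1) quantile, computed as the squared normal quantile.
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2

    def excess(mu):
        return lr_statistic(mu, total, n) - crit

    lower = bisect(excess, 1e-12, mu_hat)
    hi = 2.0 * mu_hat
    while excess(hi) < 0.0:  # widen the bracket until it leaves the region
        hi *= 2.0
    upper = bisect(excess, mu_hat, hi)
    return lower, upper


lower, upper = lr_interval([3, 5, 4, 6, 2])
print(lower, upper)
```

For these five counts the MLE is 4 and the interval extends further above the MLE than below it, reflecting the skewness of the Poisson likelihood that a symmetric Wald interval would ignore.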
2. Confidence Regions, Hypothesis Tests, and Optimality
Likelihood-ratio regions underpin a duality between Neyman-type hypothesis testing and confidence region construction. In the classical setting, the acceptance region in the sample space for a parameter value $\theta_0$ is defined as those data points for which the likelihood ratio in favor of $\theta_0$ over its maximizer (for that data) exceeds a critical threshold:
$$A(\theta_0) = \Big\{x : R(x;\theta_0) = \frac{L(\theta_0; x)}{L(\hat\theta(x); x)} \ge k_\alpha(\theta_0)\Big\}.$$
Ordering rules such as the Feldman-Cousins prescription select the most powerful, shortest acceptance regions by including the sample points with the largest $R(x;\theta_0)$ values until the desired coverage is achieved (Karbach, 2011). The resulting regions adapt automatically to boundaries and avoid pathological flip-flop behavior (switching between one-sided and two-sided intervals) while maintaining frequentist coverage guarantees.
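The ordering rule can be sketched for the textbook case of a Poisson count with known background and a non-negative signal. The model, the function names, and the truncation at `n_max` are assumptions for illustration; this is not the toy Monte Carlo procedure of Karbach (2011).

```python
import math


def poisson_pmf(n, mu):
    """Poisson probability of n counts for mean mu > 0."""
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))


def fc_acceptance(signal, background, alpha=0.10, n_max=100):
    """Acceptance region in n for one signal value, built by likelihood-ratio ordering."""
    mu = signal + background
    ranked = []
    for n in range(n_max + 1):
        # Best-fit signal restricted to the physical region signal >= 0.
        best_signal = max(0.0, n - background)
        ratio = poisson_pmf(n, mu) / poisson_pmf(n, best_signal + background)
        ranked.append((ratio, n))
    ranked.sort(reverse=True)  # largest likelihood ratio first

    accepted, coverage = [], 0.0
    for _, n in ranked:
        accepted.append(n)
        coverage += poisson_pmf(n, mu)
        if coverage >= 1.0 - alpha:
            break
    return sorted(accepted), coverage


accepted, coverage = fc_acceptance(signal=0.5, background=3.0)
print(accepted, round(coverage, 4))
```

Because counts are discrete, the achieved coverage generally exceeds the nominal level. Repeating the construction over a grid of signal values and collecting those whose acceptance region contains the observed count yields the confidence interval.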
The likelihood-ratio approach generalizes to settings without closed-form pivotal quantities. If the likelihood-ratio statistic is itself pivotal, likelihood-ratio intervals coincide with exact classical intervals; otherwise, they retain near-nominal coverage via asymptotic $\chi^2$ calibration or parametric bootstrap (Tian et al., 2021).
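A parametric bootstrap replaces the asymptotic chi-square quantile with a simulated one. The sketch below assumes an exponential model with unknown rate; the function names, the number of bootstrap replicates, and the seed are illustrative choices rather than settings from Tian et al. (2021).

```python
import math
import random


def exp_lr_stat(sample, rate0):
    """Likelihood-ratio statistic for H0: rate = rate0 under an exponential model."""
    n, total = len(sample), sum(sample)
    ratio = rate0 * total / n  # equals rate0 / rate_hat, since rate_hat = n / total
    return 2.0 * n * (ratio - 1.0 - math.log(ratio))


def bootstrap_critical_value(sample, alpha=0.05, n_boot=2000, seed=0):
    """Simulated (1 - alpha)-quantile of the statistic under the fitted model."""
    rng = random.Random(seed)
    n = len(sample)
    rate_hat = n / sum(sample)
    stats = []
    for _ in range(n_boot):
        resample = [rng.expovariate(rate_hat) for _ in range(n)]
        # Evaluate the statistic at the value that generated the resample.
        stats.append(exp_lr_stat(resample, rate_hat))
    stats.sort()
    index = min(n_boot - 1, math.ceil((1.0 - alpha) * n_boot) - 1)
    return stats[index]


sample = [0.8, 1.9, 0.3, 2.4, 1.1]
print(bootstrap_critical_value(sample))
```

The returned value is used in place of the chi-square quantile when thresholding the statistic. In this particular model the statistic depends on the data only through the ratio of the hypothesized rate to the estimated rate, so it is pivotal and the bootstrap recovers its finite-sample distribution up to Monte Carlo error.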
3. Likelihood-Ratio Regions in Specialized Frameworks
3.1 Empirical Likelihood and Density-Ratio Models
For inference on population quantiles from several linked populations, one may use empirical likelihood under the density-ratio model (DRM). The ELRT statistic is defined by embedding quantile constraints into the empirical likelihood, and inverting the test to yield a confidence region for quantiles. The ELRT confidence region for $m$ quantile constraints is
$$\{\xi : R_n(\xi) \le \chi^2_{m,1-\alpha}\},$$
where $R_n(\xi)$ is twice the difference between the unconstrained and constrained log-empirical likelihoods, $\xi$ is the vector of candidate quantile values, and $\chi^2_{m,1-\alpha}$ is the $(1-\alpha)$-quantile of the $\chi^2_m$ distribution. Efficient computational strategies leverage root-finding and dual formulations. The DRM+ELRT approach improves coverage and efficiency compared to separate-sample or Wald-based methods (Zhang et al., 2020).
3.2 Bayesian and Quantum State Estimation
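In quantum tomography, maximum-likelihood regions (MLR) are defined as those states $\rho$ satisfying $L(D \mid \rho) \ge \lambda\, L(D \mid \hat\rho_{\mathrm{ML}})$ for threshold $\lambda \in [0, 1]$, with size fixed by $s_\lambda = \int_{\mathcal{R}_\lambda} (\mathrm{d}\rho)$ for prior $(\mathrm{d}\rho)$, where $\mathcal{R}_\lambda$ denotes the resulting region. The smallest credible regions (SCR) are minimal-volume sets of posterior probability $c$. Both kinds of regions have constant-likelihood boundaries and are optimal in their respective volumetric or credibility senses (Shang et al., 2013).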
3.3 ROC Curves and Binary Hypothesis Testing
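The Neyman–Pearson framework realizes the optimal ROC as a parametric curve traced out by varying the likelihood-ratio threshold:
$$\mathrm{ROC}(p) = \max\{P_1(A) : P_0(A) \le p\}, \qquad 0 \le p \le 1,$$
where the maximum is taken over (possibly randomized) decision regions $A$ and is attained by LR regions $A_\tau = \{x : L(x) > \tau\}$, with $L = \mathrm{d}P_1/\mathrm{d}P_0$ the likelihood ratio and randomization on $\{L(x) = \tau\}$ where needed (Hajek et al., 2022). When only likelihood-ratio samples are observed, the optimal ROC can be consistently estimated via a finite-dimensional MLE, which solves a convex program for the masses placed on observed values.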
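To make the threshold parametrization concrete, the sketch below traces the ROC for a unit-variance Gaussian shift, with null N(0, 1) and alternative N(d, 1). This model and the function name `roc_point` are assumptions chosen because the likelihood ratio exp(d x - d^2 / 2) is monotone in x, so each likelihood-ratio threshold maps to a cut on x. It does not implement the MLE of Hajek et al. (2022).

```python
import math
from statistics import NormalDist


def roc_point(tau, d):
    """(false-alarm, detection) probabilities of the region {x : L(x) > tau}.

    Assumes null N(0, 1) and alternative N(d, 1) with d > 0, so that
    L(x) = exp(d * x - d**2 / 2) and L(x) > tau is the same as x > t.
    """
    std = NormalDist()
    t = math.log(tau) / d + d / 2.0
    p_false_alarm = 1.0 - std.cdf(t)
    p_detect = 1.0 - std.cdf(t - d)
    return p_false_alarm, p_detect


# Sweep the threshold to trace the curve from the upper right to the lower left.
curve = [roc_point(math.exp(k / 4.0), d=2.0) for k in range(-16, 17)]
for p_fa, p_d in curve[::8]:
    print(round(p_fa, 4), round(p_d, 4))
```

With `tau = 1` and `d = 2` the cut sits midway between the two means, so the false-alarm and miss probabilities coincide.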
4. Extensions: High-Dimensional, Semiparametric, and Modern ML
4.1 High-Dimensional Semiparametric GLMs
For estimation and inference in high-dimensional semiparametric generalized linear models, likelihood-ratio regions are built using profile likelihoods, regularization (SCAD, MCP), and bias correction. The test statistic for a low-dimensional target $\beta$ in a semiparametric GLM with high-dimensional nuisance $\gamma$ is
$$\Lambda_n(\beta_0) = 2\big[\ell_p(\hat\beta) - \ell_p(\beta_0)\big],$$
where $\ell_p$ is the profiled log-likelihood, $\hat\beta$ its maximizer, and $\beta_0$ the hypothesized value. Asymptotic $\chi^2$ behavior is retained under suitable sparsity and regularity conditions, and the resulting regions achieve nominal coverage in settings with complex structure and incomplete data (Ning et al., 2014).
4.2 Deep Neural Networks
Likelihood-ratio regions for neural network predictions are constructed via constrained optimization over network parameters, testing whether the output at a given test point $x_0$ equals a candidate value $c$. The resulting region is
$$\{c : \Lambda(c) \le \chi^2_{1,1-\alpha}\},$$
where $\Lambda(c)$ is twice the difference between the unconstrained and constrained (targeting $f_\theta(x_0) = c$) log-likelihoods, $f_\theta$ is the network output, and $\chi^2_{1,1-\alpha}$ is the $(1-\alpha)$-quantile of the $\chi^2_1$ distribution. The DeepLR method incorporates the effects of architecture and regularization directly in the feasible parameter set. These intervals are typically asymmetric, adapt to regions of low training data, and can be calibrated via parametric bootstrap (Sluijterman et al., 2023).
5. Computational Strategies and Geometry
Likelihood-ratio regions, especially in multiple dimensions, are computed efficiently using boundary-tracing and root-finding techniques. The radial-profile method parameterizes the region boundary in angular coordinates and solves one-dimensional equations for the radius along each angle (Jaeger, 2015). This is computationally superior to direct grid search, especially in higher dimensions. Practical guidance includes monotonicity checks and vectorized likelihood evaluations.
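The radial idea can be sketched for a two-parameter model. The example assumes a normal sample parametrized by its mean and log standard deviation, casts rays from the MLE, and solves for the radius at which the likelihood-ratio statistic reaches the chi-square(2) quantile, which equals -2 log(alpha) exactly. The parametrization, the function names, and the bisection solver are illustrative assumptions, not the implementation of Jaeger (2015). The sketch also assumes that the statistic crosses the threshold only once along each ray.

```python
import math


def loglik(mu, eta, n, xbar, ss):
    """Normal log-likelihood in (mu, eta = log sigma), up to a constant."""
    return -n * eta - (ss + n * (xbar - mu) ** 2) / (2.0 * math.exp(2.0 * eta))


def radial_boundary(data, alpha=0.05, n_angles=72):
    """Boundary points of the joint likelihood-ratio region for (mu, log sigma)."""
    n = len(data)
    xbar = sum(data) / n
    ss = sum((x - xbar) ** 2 for x in data)
    mu_hat, eta_hat = xbar, 0.5 * math.log(ss / n)
    l_max = loglik(mu_hat, eta_hat, n, xbar, ss)
    crit = -2.0 * math.log(alpha)  # exact chi-square(2) quantile

    def excess(r, phi):
        mu = mu_hat + r * math.cos(phi)
        eta = eta_hat + r * math.sin(phi)
        return 2.0 * (l_max - loglik(mu, eta, n, xbar, ss)) - crit

    boundary = []
    for k in range(n_angles):
        phi = 2.0 * math.pi * k / n_angles
        lo, hi = 0.0, 1e-3
        while excess(hi, phi) < 0.0:  # expand until the ray leaves the region
            hi *= 2.0
        for _ in range(100):  # one-dimensional bisection on the radius
            mid = 0.5 * (lo + hi)
            if excess(mid, phi) < 0.0:
                lo = mid
            else:
                hi = mid
        r = 0.5 * (lo + hi)
        boundary.append((mu_hat + r * math.cos(phi), eta_hat + r * math.sin(phi)))
    return (mu_hat, eta_hat), boundary, crit


data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6, 4.4]
mle, boundary, crit = radial_boundary(data)
print(mle, boundary[0])
```

Each angle costs a few dozen likelihood evaluations regardless of how finely the region would otherwise have to be gridded, which is the source of the efficiency gain over grid search. A monotonicity check along each ray would guard against regions that are not star-shaped about the MLE.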
In Bayesian and quantum contexts, region volumes and shapes are highly prior-dependent, whereas the iso-likelihood surfaces are invariant to prior choice.
6. Signal Detection, Detection Boundaries, and High-Dimensional Phase Transitions
Likelihood-ratio regions demarcate the boundaries between regions of detectability, non-detectability, and partial detectability in high-dimensional hypothesis testing. The phase diagram for sparse signal detection in noisy data is determined by the scaling of signal sparsity and strength, with explicit boundaries calculable for both parametric and nonparametric cases. At the detection boundary, the log-likelihood-ratio statistic converges in distribution to a Gaussian shift model, yielding nontrivial but sub-unity power (Ditzhaus et al., 2017). In contrast, tests such as Tukey's higher criticism become powerless on the boundary. The Pitman asymptotic relative efficiency of the LLR test can be quantified exactly in this regime.
7. Applications and Impact
Likelihood-ratio regions serve as the foundational device for:
- Construction of exact or asymptotically valid frequentist confidence regions and intervals.
- Bayesian credible and maximum-likelihood regions, particularly in quantum state estimation.
- Computation of prediction regions for future observations using inversion of LR statistics.
- Derivation and assessment of ROC curves and AUC statistics from likelihood-ratio samples.
- Quantitative analysis of detectability in sparse high-dimensional inference, with precise determination of statistical phase transitions.
- Quantile estimation and efficiency-improvement via empirical likelihood and DRM in linked-population inference.
- Uncertainty quantification for neural network outputs with automatic adaptation to data geometry, architecture, and regularization.
The efficiency and coverage of likelihood-ratio regions have been investigated by simulation in standard and complex models. In the cited studies they typically outperform Wald-type and separate-sample methods on real and synthetic datasets (Zhang et al., 2020, Jaeger, 2015, Sluijterman et al., 2023, Ning et al., 2014).
References:
- "Computation of Two and Three Dimensional Confidence Regions with the Likelihood Ratio" (Jaeger, 2015)
- "Empirical likelihood ratio test on quantiles under a density ratio model" (Zhang et al., 2020)
- "Feldman-Cousins Confidence Levels - Toy MC Method" (Karbach, 2011)
- "Optimal error regions for quantum state estimation" (Shang et al., 2013)
- "Likelihood-ratio-based confidence intervals for neural networks" (Sluijterman et al., 2023)
- "Constructing Prediction Intervals Using the Likelihood Ratio Statistic" (Tian et al., 2021)
- "A Likelihood Ratio Framework for High Dimensional Semiparametric Regression" (Ning et al., 2014)
- "Maximum Likelihood Estimation of Optimal Receiver Operating Characteristic Curves from Likelihood Ratio Observations" (Hajek et al., 2022)
- "Detectability of nonparametric signals: higher criticism versus likelihood ratio" (Ditzhaus et al., 2017)