
Likelihood-Ratio & Info-Theoretic Detectors

Updated 25 October 2025
  • Likelihood-Ratio and Info-Theoretic Detectors are statistical procedures that optimize detection performance by comparing probabilities under alternative hypotheses and leveraging measures like KL divergence.
  • The framework distinguishes regimes where scan statistics excel for detecting small-scale signals while averaging (ALR) robustly aggregates evidence for larger-scale signal detection.
  • Condensed ALR methods improve computational efficiency to O(n log^2 n) without sacrificing optimality, making them practical for large-scale and real-time applications.

Likelihood-ratio and optimal information-theoretic detectors comprise a class of decision rules and statistical procedures that maximize detection performance in hypothesis testing, classification, and signal processing by exploiting the statistical structure of both signal and noise. These detectors are rooted in the framework of likelihood-ratio tests (LRTs), which compare the probability of observed data under alternative hypotheses, and integrate concepts from information theory such as Kullback-Leibler (KL) divergence and error exponents. Optimization over these statistics is essential for problems where signals have unknown extent, involve compositional uncertainty, require robust multi-scale inference, or must be adapted to practical computational or privacy constraints.

1. Core Principles of Likelihood-Ratio and Information-Theoretic Detection

The likelihood-ratio framework is central to optimal detection and statistical inference. For observations $X$ under hypotheses $H_0$ and $H_1$, the likelihood ratio $\Lambda(X) = \frac{p_1(X)}{p_0(X)}$ (where $p_i(X)$ is the probability density under $H_i$) is used in the Neyman–Pearson lemma to achieve maximal detection power at any given false alarm level.
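As a small illustration (not from the paper): for i.i.d. observations under $H_0{:}\ N(0,1)$ versus $H_1{:}\ N(\mu,1)$, the log likelihood ratio is linear in the data, so the Neyman–Pearson test reduces to thresholding the sample sum. A minimal sketch:

```python
def log_likelihood_ratio(x, mu):
    """log Lambda(x) = sum_i log[p1(x_i)/p0(x_i)] for unit-variance Gaussians
    N(mu, 1) vs N(0, 1); each term simplifies to mu*x_i - mu^2/2."""
    return sum(mu * xi - 0.5 * mu**2 for xi in x)

# Because log Lambda is monotone increasing in sum(x) (for mu > 0), the
# Neyman-Pearson-optimal test "reject H0 when Lambda(x) > c" is equivalent
# to "reject H0 when the sample sum exceeds a threshold".
```

Since the statistic is monotone in the sample sum, calibrating the false-alarm level only requires the null distribution of that sum, not of the ratio itself.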

Information-theoretic detectors generalize this paradigm by incorporating quantities such as the KL divergence, error exponents, and large-deviation rates, which measure how distinguishable two hypotheses are as a function of data volume or system resources. For complex or composite hypotheses, surrogate statistics (profiling, averaging, penalization, etc.) are employed to approach optimality under computational or structural constraints.
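For two equal-variance Gaussian hypotheses, the KL divergence has the standard closed form $(\mu_1-\mu_0)^2/(2\sigma^2)$, and by Stein's lemma it is the exponential decay rate of the type II error at any fixed type I level. A brief sketch of this (standard) computation, assumed here rather than taken from the source:

```python
import math

def kl_gaussian(mu0, mu1, sigma=1.0):
    """KL( N(mu1, sigma^2) || N(mu0, sigma^2) ) = (mu1 - mu0)^2 / (2 sigma^2)."""
    return (mu1 - mu0) ** 2 / (2 * sigma**2)

def stein_type2_bound(n, mu0, mu1):
    """Chernoff-Stein asymptotics: the best type II error behaves like
    exp(-n * KL) to first order in the exponent, for n i.i.d. samples."""
    return math.exp(-n * kl_gaussian(mu0, mu1))
```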

In detection problems involving spatial or temporal localization of signals (e.g., segment detection in time series, cluster detection in genomics), two canonical statistics are prevalent: the scan statistic (maximum local likelihood ratio over candidate intervals) and the average likelihood ratio (ALR, which aggregates evidence across intervals). Recent research rigorously characterizes the regime-dependent optimality of these statistics (Chan et al., 2011).

2. Scale-Dependent Optimality of Scan and Average Likelihood Ratio Statistics

In the detection of deterministic signals of unknown spatial extent embedded in Gaussian noise, two likelihood-ratio–based statistics are fundamental:

  • Scan Statistic: $M_n = \max_{I \in \mathcal{J}_n} \exp\{Y_n(I)^2/2\}$, where $Y_n(I)$ is the normalized sum over interval $I$, and the maximization is over the set $\mathcal{J}_n$ of all candidate intervals.
  • Average Likelihood Ratio (ALR): $A_n = \frac{1}{|\mathcal{J}_n|} \sum_{I \in \mathcal{J}_n} \exp\{Y_n(I)^2/2\}$, aggregating information over all intervals.
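A direct, naive implementation of both statistics (an illustrative sketch, not the paper's code) makes the $O(n^2)$ interval enumeration explicit:

```python
import math

def interval_stats(y):
    """Normalized sums Y_n(I) = sum(y[i:j]) / sqrt(j - i) over all intervals."""
    n = len(y)
    prefix = [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)
    return [
        (prefix[j] - prefix[i]) / math.sqrt(j - i)
        for i in range(n)
        for j in range(i + 1, n + 1)
    ]

def scan_statistic(y):
    """M_n: maximum of exp(Y_n(I)^2 / 2) over all O(n^2) intervals."""
    return max(math.exp(s * s / 2) for s in interval_stats(y))

def average_likelihood_ratio(y):
    """A_n: average of exp(Y_n(I)^2 / 2) over all O(n^2) intervals."""
    stats = interval_stats(y)
    return sum(math.exp(s * s / 2) for s in stats) / len(stats)
```

Since a maximum always dominates an average, $M_n \ge A_n$ on every data set; the regime-dependent analysis below concerns their null calibration and power, not their raw magnitudes.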

The analytical results demonstrate a sharp dichotomy:

  • The scan statistic is optimal only for very short intervals, corresponding to signals of the smallest spatial scale, as it exploits the extreme-value theory of maxima over many nearly independent intervals. Mathematically, the optimal detection threshold in this regime is

$$|u_n| \sqrt{|I_n|} > (\sqrt{2} + \varepsilon) \sqrt{\log n},$$

where $|I_n|$ is the signal's interval length and $u_n$ its amplitude (Chan et al., 2011).

  • For larger-scale signals (where $|I_n|$ is bounded away from zero), the scan statistic becomes suboptimal, since the noise maximum is dominated by small-scale intervals. The fixed scan threshold ($\sqrt{2 \log n}$) is too conservative for large-scale alternatives, leading to a considerable loss of power.
  • The ALR, in contrast, is optimal for detecting large-scale signals, as it effectively integrates evidence over contiguous intervals and maintains the optimal error exponent in this regime. However, for the smallest-scale signals, the ALR requires a signal roughly twice as strong as the optimal threshold, yielding only a modest loss relative to the scan.

These behaviors can be summarized in the following regime table:

Signal Extent $|I_n|$                 Scan Statistic Optimality     ALR Optimality
$|I_n| \approx 1/n$ (small)           Optimal                       Slightly suboptimal
$|I_n|$ bounded away from 0           Suboptimal (power loss)       Optimal

These properties generalize to cluster detection, density/intensity models, and multivariate signals (Chan et al., 2011).

3. Condensed ALR and Computational Efficiency

Both the scan and the ALR, when implemented naively, require $O(n^2)$ computations, since there are $O(n^2)$ candidate intervals on $n$ points. The paper introduces a condensed ALR that averages only over a sparse yet "covering" subset of intervals $\mathcal{J}_{\mathrm{app}}$, selected to approximate all scales and locations.

Formally, the condensed ALR is:

$$A_{n, \mathrm{cond}} = \frac{1}{\#\mathcal{J}_{\mathrm{app}}} \sum_{I \in \mathcal{J}_{\mathrm{app}}} \exp\left\{ \frac{Y_n(I)^2}{2} \right\},$$

where $\#\mathcal{J}_{\mathrm{app}} = O(n)$. This statistic maintains statistical optimality across all scales and is computable in $O(n \log^2 n)$ time, a crucial improvement for large $n$ (Chan et al., 2011).
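One way to realize such a sparse covering set is sketched below with a simplified dyadic construction (an assumption for illustration; the paper's actual interval set and its $O(n \log^2 n)$ evaluation scheme are more refined): dyadic lengths with start points offset by half a length, so every interval is well approximated by some candidate while the total count stays $O(n)$.

```python
import math

def sparse_intervals(n):
    """Covering set J_app: for each dyadic length L = 1, 2, 4, ..., take
    intervals whose starts are spaced L/2 apart.  Per scale there are
    ~2n/L intervals, so the total over all scales is O(n)."""
    iv = []
    length = 1
    while length <= n:
        step = max(1, length // 2)
        for start in range(0, n - length + 1, step):
            iv.append((start, start + length))
        length *= 2
    return iv

def condensed_alr(y):
    """Average of exp(Y_n(I)^2 / 2) over the sparse set only."""
    n = len(y)
    prefix = [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)
    iv = sparse_intervals(n)
    total = 0.0
    for i, j in iv:
        s = (prefix[j] - prefix[i]) / math.sqrt(j - i)
        total += math.exp(s * s / 2)
    return total / len(iv)
```

On pure-noise data with all zeros, every term is $\exp(0) = 1$, so the statistic equals 1 exactly; evidence on any interval pushes it above that baseline.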

In contrast, scan-based methods require scale-dependent critical values or penalties to restore optimality, complicating their calibration and implementation.

4. Implications for Info-Theoretic Detector Design

Several key design principles emerge:

  • Non-adaptivity of Scan Statistics: The scan, when implemented with a fixed threshold, lacks adaptivity to the signal’s scale. Restoration of optimality requires interval-length–dependent critical values (penalties), introducing significant calibration overhead.
  • Robustness and Power of Averaging Procedures: ALR-based detectors offer robustness to the unknown scale, leveraging aggregation to achieve nearly optimal power in both small- and large-scale regimes (with minor signal strength loss for the finest scales only).
  • Computational Strategies: Deploying a condensed ALR or a penalized scan over a sparse interval set yields both computational tractability and strong statistical performance, crucial for practical large-scale or real-time applications.
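The penalized-scan idea can be sketched by subtracting a scale-dependent critical value from each normalized interval sum; the Dümbgen–Spokoiny-style penalty $\sqrt{2\log(en/|I|)}$ used below is an illustrative choice, not the paper's exact calibration:

```python
import math

def penalized_scan(y):
    """Penalized scan sketch: maximize |Y_n(I)| - sqrt(2 log(e*n/|I|)) over
    all intervals, so that short intervals (which face a larger noise
    maximum) must clear a correspondingly larger bar."""
    n = len(y)
    prefix = [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)
    best = -math.inf
    for i in range(n):
        for j in range(i + 1, n + 1):
            y_i = abs(prefix[j] - prefix[i]) / math.sqrt(j - i)
            penalty = math.sqrt(2 * math.log(math.e * n / (j - i)))
            best = max(best, y_i - penalty)
    return best
```

Because the penalty shrinks as the interval grows, the statistic no longer lets the many small-scale noise intervals dominate, which is precisely the adaptivity the fixed-threshold scan lacks.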

These findings have broad implications for high-dimensional detection (epidemiological cluster detection, astrophysical event localization, spatial signal detection in genomics, etc.) and motivate the use of averaging/aggregation in multi-scale inference for unknown-extent alternatives.

5. Mathematical Formulation and Detection Thresholds

The mathematical conditions that precisely delineate the transitions between optimality regimes and signal detectability are given by threshold inequalities. For a signal of unknown extent $|I_n|$ and amplitude $u_n$, the minimax detection boundary is:

$$|u_n| \sqrt{|I_n|} > \sqrt{2\log(1/|I_n|)} + b_n, \quad \text{with } b_n \to \infty \text{ arbitrarily slowly}.$$

The scan achieves optimal detection power only when $|I_n| \approx 1/n$; the ALR is optimal for $|I_n|$ bounded away from 0 (Chan et al., 2011).
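Rearranging the boundary gives the smallest detectable amplitude as a function of the interval fraction; a small helper (illustrative, with `interval_frac` standing in for $|I_n|$ and `b_n` for the slowly diverging term):

```python
import math

def alr_boundary_amplitude(interval_frac, b_n=0.0):
    """Smallest |u_n| on the detection boundary, obtained by dividing
    |u_n| sqrt(|I_n|) > sqrt(2 log(1/|I_n|)) + b_n through by sqrt(|I_n|)."""
    return (math.sqrt(2 * math.log(1 / interval_frac)) + b_n) / math.sqrt(interval_frac)
```

As expected, shorter intervals demand a larger amplitude: the required $|u_n|$ grows without bound as `interval_frac` shrinks toward $1/n$.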

The condensed ALR matches the detection boundary for all $|I_n|$, and is therefore universally optimal in the minimax sense.

6. Applications and Extensions

The described principles extend beyond univariate models to multivariate, intensity-based, and high-dimensional signal detection scenarios:

  • Cluster and Change Segment Detection: In models for the identification of anomalous regions (clusters) in spatial, temporal, or genomic data, the ALR provides a robust alternative to scan statistics that may fail at larger scales.
  • Density or Intensity Increments: For detecting "bumps" or jumps in Poisson processes, or abrupt increases in intensity, analogous scan and ALR statistics can be defined, and similar scale-related optimality properties persist.
  • Multiscale and Multivariate Generalizations: The theoretical framework and computational procedures are generalizable to higher dimensions and to arbitrarily structured interval collections, enabling applications in fields as diverse as astronomy, epidemiology, and bioinformatics (Chan et al., 2011).
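For the Poisson case, the per-interval generalized likelihood ratio for an intensity increase takes the familiar form $k\log(k/m) - (k - m)$, for observed count $k$ against baseline mean $m$; a naive scan sketch (illustrative, not the paper's implementation):

```python
import math

def poisson_scan(counts, baseline):
    """Scan for an intensity increase in Poisson count data: maximize the
    interval GLR  k log(k/m) - (k - m)  over all intervals, where k is the
    observed count and m the baseline mean; only intervals with k > m
    constitute evidence for an increase."""
    n = len(counts)
    prefix_k = [0]
    prefix_m = [0.0]
    for c, b in zip(counts, baseline):
        prefix_k.append(prefix_k[-1] + c)
        prefix_m.append(prefix_m[-1] + b)
    best = 0.0
    for i in range(n):
        for j in range(i + 1, n + 1):
            k = prefix_k[j] - prefix_k[i]
            m = prefix_m[j] - prefix_m[i]
            if k > m > 0:
                best = max(best, k * math.log(k / m) - (k - m))
    return best
```

The same small-scale vs. large-scale trade-off applies: an ALR variant would average these per-interval likelihood ratios instead of maximizing them.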

7. Summary Table: Comparative Properties

Detector           Statistical Optimality                    Computational Cost    Calibration Requirements
Scan Statistic     Only for smallest scales                  $O(n^2)$              Scale-dependent thresholds
ALR                All but smallest scales (slight loss)     $O(n^2)$              Simple, but slow
Condensed ALR      Full optimality at all scales             $O(n \log^2 n)$       None
Penalized Scan     Restores optimality with penalties        $O(n \log n)$         Sophisticated calibration

This synthesis reflects a rigorous and practical framework for developing detection procedures that are both statistically optimal and computationally efficient in problems involving signals of unknown or variable extent. These results inform current methodology in statistical signal processing, cluster detection, spatial statistics, and the design of scalable detection algorithms (Chan et al., 2011).
