
Approximating Likelihood Ratios with Calibrated Discriminative Classifiers (1506.02169v2)

Published 6 Jun 2015 in stat.AP, physics.data-an, and stat.ML

Abstract: In many fields of science, generalized likelihood ratio tests are established tools for statistical inference. At the same time, it has become increasingly common that a simulator (or generative model) is used to describe complex processes that tie parameters $\theta$ of an underlying theory and measurement apparatus to high-dimensional observations $\mathbf{x}\in \mathbb{R}^p$. However, simulators often do not provide a way to evaluate the likelihood function for a given observation $\mathbf{x}$, which motivates a new class of likelihood-free inference algorithms. In this paper, we show that likelihood ratios are invariant under a specific class of dimensionality reduction maps $\mathbb{R}^p \to \mathbb{R}$. As a direct consequence, we show that discriminative classifiers can be used to approximate the generalized likelihood ratio statistic when only a generative model for the data is available. This leads to a new machine learning-based approach to likelihood-free inference that is complementary to Approximate Bayesian Computation, and which does not require a prior on the model parameters. Experimental results on artificial problems with known exact likelihoods illustrate the potential of the proposed method.

Citations (204)

Summary

  • The paper introduces a novel likelihood-free inference method by leveraging discriminative classifiers to approximate likelihood ratios via calibrated dimensionality reduction.
  • It demonstrates that calibrated classifier outputs reduce high-dimensional density-ratio estimation to a simpler one-dimensional density estimation problem.
  • Experimental results in particle physics validate the approach, highlighting its potential to optimize hypothesis testing in complex real-world simulations.

Approximating Likelihood Ratios with Calibrated Discriminative Classifiers

This paper discusses a machine learning-based method for approximating likelihood ratios using discriminative classifiers, specifically tailored for situations where likelihood functions are not readily available due to the complexity of the simulator or the generative model. This work is particularly relevant in fields like high-energy physics, where understanding the relationship between model parameters and high-dimensional data is crucial yet computationally challenging.

Main Contributions

The authors introduce an approach to likelihood-free inference that leverages discriminative classifiers to approximate generalized likelihood ratio tests. The key insight is that the likelihood ratio is invariant under any dimensionality-reduction map that is itself a monotonic function of the likelihood ratio. A trained classifier provides exactly such a map, which turns the high-dimensional likelihood-ratio problem into a simpler one-dimensional density estimation problem.
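The core identity behind this insight can be sketched concretely. For a classifier trained on balanced samples from the two hypotheses, the Bayes-optimal output is $s(x) = p_1(x)/(p_0(x)+p_1(x))$, which is monotone in the likelihood ratio, so the ratio can be recovered as $s/(1-s)$. The following is a minimal illustrative sketch (not the paper's code) using two one-dimensional Gaussians as stand-ins for high-dimensional simulators:

```python
import math

def gauss_pdf(x, mu, sigma=1.0):
    """Density of a normal distribution N(mu, sigma^2)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Two simple hypotheses (stand-ins for intractable simulator likelihoods)
p0 = lambda x: gauss_pdf(x, 0.0)   # null hypothesis
p1 = lambda x: gauss_pdf(x, 1.0)   # alternative hypothesis

def bayes_optimal_score(x):
    # Output of an ideal classifier trained on balanced samples:
    # s(x) = p1(x) / (p0(x) + p1(x)), monotone in the likelihood ratio
    return p1(x) / (p0(x) + p1(x))

x = 0.7
s = bayes_optimal_score(x)
ratio_from_score = s / (1.0 - s)       # recover the ratio from the score
ratio_exact = p1(x) / p0(x)            # ground truth, known here by construction
print(ratio_from_score, ratio_exact)   # the two coincide exactly
```

In practice the classifier only approximates the Bayes-optimal score, which is why the calibration step discussed below matters.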

Experimental Validation: The paper provides proof of concept through experiments on artificial problems where the exact likelihoods are known. These experiments demonstrate that the method yields an effective approximation of the likelihood ratio, suggesting applicability in complex, real-world settings.

Technical Overview

  1. Likelihood Ratio and Dimensionality Reduction: The likelihood ratio test is a foundational technique for hypothesis testing across scientific domains. This paper demonstrates that such ratios are invariant under suitably chosen monotonic transformations, allowing dimensionality reduction from a high-dimensional space to the real line and thereby simplifying the inference problem.
  2. Use of Discriminative Classifiers: The authors detail a mechanism wherein discriminative classifiers are trained to distinguish between two hypotheses parameterized by a generative model. These classifiers, after training, serve as an effective means of estimating the likelihood ratio, providing an alternative when direct evaluation of the likelihood function is impractical.
  3. Calibration and Approximation: The calibration process is crucial to ensure that the classifier output directly maps to the target likelihood ratio. The paper elaborates on effective strategies for calibrating the classifier's output, ensuring that the likelihood estimation remains robust and accurate across different parameter settings.
  4. Extension to Composite Hypothesis Testing: The methodology is extended to handle composite hypotheses, which are commonplace in practical scenarios involving continuous parameters. This entails using parameterized classifiers and leveraging a specific calibration strategy, which ensures that the classifier's output scales appropriately across different hypothesis conditions.
  5. Applications in Particle Physics: One of the highlighted applications is in the field of particle physics, particularly in large-scale projects like the Large Hadron Collider experiments. Here, the method promises to optimize the detection of new particles by enabling efficient statistical tests on high-dimensional observational data.
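The calibration step (item 3 above) can be illustrated with a toy version of the procedure: estimate the one-dimensional densities of the classifier score under each hypothesis and take their ratio. The sketch below, under assumed toy Gaussian hypotheses, uses the identity map as the score (any map monotone in the likelihood ratio works, and for two unit-variance Gaussians the observation itself qualifies) and histograms as the 1-D density estimators; all names are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples from two toy 1-D "simulators"
x0 = rng.normal(0.0, 1.0, 100_000)   # hypothesis theta_0
x1 = rng.normal(1.0, 1.0, 100_000)   # hypothesis theta_1

def score(x):
    # Stand-in for a trained classifier's output: for these two Gaussians
    # the identity map is already monotone in the likelihood ratio.
    return x

# Calibration: 1-D density estimation of the score under each hypothesis
bins = np.linspace(-5.0, 6.0, 101)
h0, _ = np.histogram(score(x0), bins=bins, density=True)
h1, _ = np.histogram(score(x1), bins=bins, density=True)

def approx_ratio(x):
    # Ratio of calibrated score densities approximates the likelihood ratio
    i = int(np.clip(np.digitize(score(x), bins) - 1, 0, len(h0) - 1))
    return h1[i] / h0[i]

# Analytic ratio p1/p0 = exp(x - 1/2) for these Gaussians, for comparison
x = 0.5
print(approx_ratio(x), np.exp(x - 0.5))
```

The histogram calibration is the simplest choice; the paper discusses calibration strategies more generally, and smoother 1-D density estimators can be substituted without changing the structure of the procedure.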

Implications and Future Directions

The implications of this work are significant, particularly in areas where generative models are complex and simulators are used extensively. By reframing likelihood inference as a classification problem, this approach offers a competitive alternative to Approximate Bayesian Computation (ABC) that, unlike ABC, requires no prior on the model parameters and fits naturally into frequentist settings.

In applied research contexts such as high-energy physics, neural networks and tree-based models predominate for these classification tasks. The authors also recognize the importance of future work on calibration strategies to further improve robustness. The paper lays a foundation for integrating these methodologies into existing workflows, potentially impacting experimental design and analysis in significant ways.

In summary, this paper presents a thorough exploration and validation of using machine learning for approximating likelihood ratios in a likelihood-free setting, providing a new toolkit for researchers facing the challenges associated with high-dimensional data analysis and parameter inference.
