
LFIRE: Likelihood-Free Inference by Ratio Estimation

Updated 26 February 2026
  • The paper introduces LFIRE, which recasts Bayesian inference as density ratio estimation using logistic regression to bypass intractable likelihoods.
  • The method employs L1-penalized logistic regression on candidate summary statistics, automatically selecting informative features in high-dimensional contexts.
  • LFIRE outperforms traditional approaches like ABC and synthetic likelihood, demonstrating robust accuracy in non-Gaussian and complex dynamical models.

Likelihood-Free Inference by Ratio Estimation (LFIRE) is a simulation-based approach for Bayesian parameter inference in models where the likelihood function p(x | θ) is analytically intractable but sampling from the process is feasible. LFIRE recasts the inference task as the estimation of a density ratio between the data-generating process and the marginal data distribution, and employs statistical classification, specifically logistic regression with optional regularization, to estimate this ratio. The posterior is then reconstructed up to a normalizing constant by combining the estimated ratio with the prior distribution. LFIRE generalizes and surpasses classical likelihood-free methods such as approximate Bayesian computation (ABC) and synthetic likelihood (SL), especially in handling non-Gaussian posteriors and irrelevant or high-dimensional summary statistics, and it provides a theoretically principled framework for automatic statistic selection and robust inference (Thomas et al., 2016).

1. Likelihood-Free Inference and the Role of Density Ratios

The fundamental challenge addressed by LFIRE is parametric inference in simulator-based models p(x | θ), where direct evaluation of the likelihood for observed data x_0 is computationally prohibitive or impossible. Given a prior p(θ), the goal is to construct the posterior

p(θ | x_0) ∝ p(θ) p(x_0 | θ).

Since p(x_0 | θ) is not available in closed form, LFIRE introduces the central density-ratio idea: r(x; θ) ≡ p(x | θ) / p(x), where p(x) = ∫ p(θ) p(x | θ) dθ is the marginal (prior-predictive) data distribution. By Bayes' theorem,

p(θ | x_0) ∝ p(θ) r(x_0; θ).

Accurate estimation of r(x; θ) thus enables recovery of the (unnormalized) posterior for arbitrary x_0 and prior p(θ). This structure distinguishes LFIRE from methods that approximate the likelihood via summary-statistic distances (ABC) or restrictive parametric assumptions (SL) (Thomas et al., 2016, Papamakarios, 2019).
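
As a sanity check of this identity, the sketch below uses a hypothetical conjugate Gaussian model (x | θ ~ N(θ, 1) with prior θ ~ N(0, 1)), chosen only because every density is tractable there; it verifies numerically that renormalizing p(θ) r(x_0; θ) recovers the exact posterior. The model and variable names are illustrative, not taken from the paper.

```python
import numpy as np
from scipy import stats

# Illustrative conjugate model: x | theta ~ N(theta, 1), theta ~ N(0, 1),
# so the marginal is p(x) = N(0, 2) and the exact posterior is N(x0/2, 1/2).
x0 = 1.3
thetas = np.linspace(-3.0, 3.0, 201)
dt = thetas[1] - thetas[0]

prior = stats.norm(0.0, 1.0).pdf(thetas)
likelihood = stats.norm(thetas, 1.0).pdf(x0)        # p(x0 | theta)
marginal = stats.norm(0.0, np.sqrt(2.0)).pdf(x0)    # p(x0)
ratio = likelihood / marginal                       # r(x0; theta)

# Renormalizing p(theta) * r(x0; theta) on the grid should recover the
# exact posterior density.
unnorm = prior * ratio
post = unnorm / (unnorm.sum() * dt)
exact = stats.norm(x0 / 2.0, np.sqrt(0.5)).pdf(thetas)
max_err = np.max(np.abs(post - exact))
```

In LFIRE itself the ratio is of course not computed from densities but estimated by a classifier, which is exactly what the next section develops.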

2. Ratio Estimation via Classification and the Logistic Regression Objective

LFIRE frames density ratio estimation as a two-class classification problem. For each θ of interest, one generates two sets of independent simulated data:

  • Positive class: X^+ = {x_i^+}, i = 1, …, n^+, drawn from p(x | θ);
  • Negative class: X^- = {x_j^-}, j = 1, …, n^-, drawn from p(x), where p(x) is sampled by first drawing θ' ~ p(θ) and then x ~ p(x | θ').

A real-valued score function h(x) parameterizes the classifier. The probability that x belongs to the positive class is modeled as P[x ∈ X^+ | h] = 1 / (1 + ν exp(−h(x))), with class-imbalance factor ν = n^- / n^+. The logistic loss is

J(h; θ) = (1 / (n^+ + n^-)) [ Σ_{x_i^+ ∈ X^+} log(1 + ν e^{−h(x_i^+)}) + Σ_{x_j^- ∈ X^-} log(1 + ν^{−1} e^{h(x_j^-)}) ].

In the large-sample limit, the minimizer h* satisfies h*(x; θ) = log r(x; θ). The estimated ratio r̂(x; θ) = exp[ĥ(x; θ)], where ĥ is the score function fitted on the finite samples, is then used directly in the posterior formula (Thomas et al., 2016).
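
To make the classifier-to-ratio correspondence concrete, the following sketch (with hand-chosen illustrative settings, not the paper's experiments) fits a plain logistic regression on balanced classes (ν = 1) drawn from N(θ, 1) versus the marginal N(0, 2) of the earlier toy conjugate model, using quadratic features so that the true log ratio lies inside the model class; the fitted log-odds then approximate log r(x; θ).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
theta, n = 1.0, 100_000

# Balanced classes (nu = 1): positives from p(x | theta) = N(theta, 1),
# negatives from the marginal p(x) = N(0, 2) of the toy conjugate model.
xp = rng.normal(theta, 1.0, n)
xm = rng.normal(0.0, np.sqrt(2.0), n)
x = np.concatenate([xp, xm])
y = np.concatenate([np.ones(n), np.zeros(n)])

# Quadratic features: here the true log ratio is quadratic in x, so the
# classifier's log-odds converge to log r(x; theta).
Psi = np.column_stack([x, x**2])
clf = LogisticRegression(C=1e6).fit(Psi, y)  # near-unpenalized fit

xq = np.array([-1.0, 0.0, 1.0, 2.0])
log_r_hat = clf.decision_function(np.column_stack([xq, xq**2]))
# Closed-form log ratio for this toy model, for comparison.
log_r_true = 0.5 * np.log(2.0) - xq**2 / 4.0 + theta * xq - theta**2 / 2.0
max_err = np.max(np.abs(log_r_hat - log_r_true))
```

With 2 × 100,000 simulated points the fitted log-odds typically match the analytic log ratio to within a few hundredths across the query points.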

3. Summary Statistics and Automatic Selection

To accommodate high-dimensional data, LFIRE restricts h(x) to the linear span of b candidate summary statistics ψ(x) ∈ R^b: h(x) = β^T ψ(x). The resulting problem becomes a penalized (lasso) logistic regression, β*(θ) = argmin_β { J(β; θ) + λ ‖β‖_1 }, with regularization parameter λ chosen via cross-validation to minimize the empirical classification error. The L1 penalty ensures automatic selection of a sparse subset of informative summary statistics from a possibly large candidate set, addressing a key difficulty in earlier likelihood-free methods (Thomas et al., 2016).
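
The selection behavior can be sketched with scikit-learn's `LogisticRegressionCV` standing in for the paper's procedure (λ corresponds to 1/C here, and the synthetic data below is an assumption made only so the result is checkable): of b = 20 candidate statistics, only the first carries signal, and the cross-validated lasso should keep it while shrinking the noise weights toward zero.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(1)
n, b = 2000, 20
theta = 2.0

# Only the first of b candidate statistics is informative (its mean shifts
# with theta); the remaining b - 1 are pure noise under both classes.
pos = rng.normal(0.0, 1.0, (n, b))
pos[:, 0] += theta
neg = rng.normal(0.0, 1.0, (n, b))
Psi = np.vstack([pos, neg])
y = np.concatenate([np.ones(n), np.zeros(n)])

# Lasso-penalized logistic regression; the penalty strength is chosen by
# 5-fold cross-validation over a grid of 10 candidate values.
clf = LogisticRegressionCV(penalty="l1", solver="liblinear", Cs=10, cv=5)
clf.fit(Psi, y)
beta = clf.coef_.ravel()
# Expected: beta[0] keeps a large weight, beta[1:] are shrunk near zero.
```

The surviving nonzero entries of β identify which candidate statistics the method considers informative for this θ.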

The implicit exponential-family approximation induced by this scheme,

p̃(x | θ) = p(x) exp[β*(θ)^T ψ(x)],

subsumes synthetic-likelihood approaches as the special case in which ψ consists of first and second moments and the likelihood is Gaussian in summary space.
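
To see why, note that a multivariate Gaussian log-density is linear in the statistics ψ(x) = (x, vec(x xᵀ)); the following expansion is a standard identity, not specific to the paper:

```latex
\log \mathcal{N}(x;\mu_\theta,\Sigma_\theta)
  = \mu_\theta^\top \Sigma_\theta^{-1} x
    \;-\; \tfrac{1}{2}\operatorname{tr}\!\big(\Sigma_\theta^{-1}\, x x^\top\big)
    \;-\; \tfrac{1}{2}\mu_\theta^\top \Sigma_\theta^{-1}\mu_\theta
    \;-\; \tfrac{1}{2}\log\big|2\pi\Sigma_\theta\big|,
```

so choosing β(θ) to collect the entries of Σ_θ^{-1} μ_θ and −½ Σ_θ^{-1} reproduces a Gaussian synthetic likelihood in summary space, up to θ-dependent constants absorbed by normalization.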

4. Posterior Recovery and Practical Implementation

Inference using LFIRE proceeds by repeating the ratio estimation routine for each parameter value θ of interest (on a grid or within MCMC/importance sampling). For each θ:

  1. Simulate data for X^+ and X^-;
  2. Fit the penalized logistic regression to estimate β*(θ);
  3. For observed data x_0, compute r̂(x_0; θ) = exp[β*(θ)^T ψ(x_0)];
  4. Form the unnormalized posterior p(θ) r̂(x_0; θ);
  5. Optionally, sample from this distribution using standard techniques.
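
The steps above can be sketched end-to-end. This toy run again uses the tractable conjugate Gaussian model (an assumption made for checkability, not one of the paper's case studies) with an unpenalized quadratic classifier standing in for the lasso step, and a grid over θ in place of a sampler.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
x0 = 1.3                            # "observed" data point
thetas = np.linspace(-3.0, 3.0, 41)
n = 4000

def log_ratio_at(theta):
    # Step 1: simulate positive (model) and negative (marginal) classes.
    xp = rng.normal(theta, 1.0, n)            # p(x | theta) = N(theta, 1)
    xm = rng.normal(0.0, np.sqrt(2.0), n)     # marginal p(x) = N(0, 2)
    x = np.concatenate([xp, xm])
    y = np.concatenate([np.ones(n), np.zeros(n)])
    # Step 2: fit the classifier (quadratic summaries, unpenalized here).
    clf = LogisticRegression(C=1e4).fit(np.column_stack([x, x**2]), y)
    # Step 3: evaluate the estimated log ratio at the observed data.
    return clf.decision_function([[x0, x0**2]])[0]

# Steps 4-5: unnormalized posterior on a grid, normalized numerically.
log_post = np.array([stats.norm(0.0, 1.0).logpdf(t) + log_ratio_at(t)
                     for t in thetas])
post = np.exp(log_post - log_post.max())
dt = thetas[1] - thetas[0]
post /= post.sum() * dt
post_mean = float(np.sum(thetas * post) * dt)
# Exact posterior for this conjugate model is N(x0 / 2, 1/2).
```

The grid over θ makes the per-parameter refitting explicit; in practice the same `log_ratio_at` routine would be called from inside an MCMC or importance sampler.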

The dominant computational cost arises from the need for repeated forward simulations, while the overhead from regularized logistic regression is negligible compared to simulation time in typical applications (Thomas et al., 2016).

5. Empirical Performance, Extensions, and Robustness

LFIRE has demonstrated superior robustness and accuracy compared to ABC and synthetic likelihood, particularly in the following regimes:

  • Non-Gaussian posteriors: e.g., ARCH(1) and other time-series models where the Gaussian assumption of SL is inadequate. LFIRE returned accurate posteriors while remaining robust to extraneous summary statistics.
  • High-dimensional summaries: in a cell-spreading model with 291 candidate statistics, LFIRE selected a small, relevant subset and concentrated the posterior effectively, whereas SL required orders of magnitude more simulations for similar accuracy.
  • Complex dynamics: in ecological and weather models (e.g., the Ricker map and the Lorenz system), LFIRE handled non-Gaussianity and complex summary structure better than competitors.

The method generalizes to dynamic and time-series data by using deep neural networks (e.g., 1D CNNs) to learn summary statistics, as in the DIRE extension for dynamical systems (Dinev et al., 2018). Extensions also incorporate amortized (global, parameterized) ratio estimators, advanced sampling strategies, and additional structure (kernel methods, path signature embeddings) for time-series and high-dimensional data (Dyer et al., 2022, Izbicki et al., 2014).

6. Theoretical Properties and Connections to Contrastive Learning

LFIRE's core procedure is consistent: with sufficient model capacity and data, the ratio estimator converges to the true r(x; θ), and the posterior therefore converges to the correct Bayesian posterior. LFIRE is recognized as the K = 2 case of a more general contrastive (multi-class) learning objective; this connection has unified density-ratio, regression-based, and classifier-based likelihood-free inference in a common statistical framework (Durkan et al., 2020). Recent work has shown that augmenting the classifier loss with additional quantities available from partially white-box simulators (e.g., joint likelihood ratios and joint scores) yields substantial sample-efficiency gains (Stoye et al., 2018, Brehmer et al., 2018).

LFIRE also provides a robust alternative to frequentist discriminative MLE approaches (as in “Carl”), being more stable to prior misspecification since it estimates the full Bayesian posterior, not just individual likelihood ratios relative to a fixed reference parameter (Thomas et al., 2016).

7. Summary Table: Canonical LFIRE Workflow

Step                       | Action                                      | Technique
---------------------------+---------------------------------------------+------------------------------------
Data simulation            | X^+ ~ p(x∣θ), X^- ~ p(x)                    | Simulator forward calls
Feature construction       | Compute ψ(x) for all x                      | Candidate summaries or learned CNN
Logistic regression        | Fit h(x) = β^T ψ(x) via penalized loss      | L1-regularized convex optimization
Statistic selection        | Cross-validate λ, select sparse β           | K-fold CV on classification error
Posterior computation      | Compute r̂(x_0; θ), combine with prior      | Plug-in formula
Uncertainty quantification | Grid or sampler over θ                      | MCMC or importance sampling

LFIRE thus offers an efficient, flexible framework for likelihood-free Bayesian inference in intractable generative models, supporting principled statistic selection, extensibility to complex data structures, and strong theoretical guarantees (Thomas et al., 2016, Dinev et al., 2018, Durkan et al., 2020).
