Reverse AIS Estimator (RAISE)
- The paper introduces RAISE, a stochastic estimator that reliably computes conservative, lower-bound log-likelihoods for complex undirected models.
- RAISE reframes likelihood estimation by computing an augmented partition function via reverse annealed importance sampling, blending ideas from AIS and CSL.
- Empirical evaluations demonstrate that RAISE and AIS together tightly bracket the true log-likelihood with far smaller bias than CSL, though RAISE comes with increased computational cost.
The Reverse AIS Estimator (RAISE) is a stochastic estimator designed to provide conservative (lower bound) estimates of log-likelihoods for undirected graphical models such as Restricted Boltzmann Machines (RBMs), Deep Boltzmann Machines (DBMs), and Deep Belief Networks (DBNs). Its primary utility lies in reliably evaluating models for which the partition function is intractable, where standard Annealed Importance Sampling (AIS) may yield over-optimistic model evaluations due to a tendency to underestimate the partition function. RAISE blends ideas from AIS and Conservative Sampling-based Likelihood (CSL) to produce test log-likelihood lower bounds that are both practical and accurate for complex generative models (Upadhya et al., 2015, Burda et al., 2014).
1. Problem Setting and Motivation
Undirected probabilistic graphical models like RBMs define densities of the form $p(\mathbf{v}) = f(\mathbf{v})/Z$, where the unnormalized probability $f(\mathbf{v})$ is typically straightforward to compute, but the partition function $Z = \sum_{\mathbf{v}} f(\mathbf{v})$ is intractable for high-dimensional $\mathbf{v}$. Assessing model fit on held-out test data requires estimating the average test log-likelihood $\frac{1}{N}\sum_{i=1}^{N} \log p(\mathbf{v}^{(i)}) = \frac{1}{N}\sum_{i=1}^{N} \log f(\mathbf{v}^{(i)}) - \log Z$. Since $Z$ is unknown, one relies on stochastic estimators.
AIS is widely used for partition function estimation. Although its estimate $\hat{Z}_{\text{AIS}}$ is unbiased for $Z$, by Jensen’s inequality $\mathbb{E}[\log \hat{Z}_{\text{AIS}}] \le \log Z$, causing AIS to typically underestimate $\log Z$, and hence the estimated log-likelihood $\log f(\mathbf{v}) - \log \hat{Z}_{\text{AIS}}$ becomes an overestimate—making AIS a non-conservative, potentially misleading estimator in practice (Burda et al., 2014).
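This Jensen effect is easy to reproduce numerically. The following toy sketch (the discrete model, sample sizes, and all names are assumptions for the demo, not from either paper) builds an unbiased importance-sampling estimator of $Z$ and shows that the average of its logarithm still falls below $\log Z$:

```python
# Toy demo: an unbiased Z_hat still satisfies E[log Z_hat] <= log Z by
# Jensen's inequality, so log f(v) - log Z_hat overestimates on average.
import numpy as np

rng = np.random.default_rng(0)

xs = np.arange(10)
f = np.exp(-((xs - 3.0) ** 2) / 4.0)  # unnormalized target f(x)
Z = f.sum()                           # exact partition function (tractable here)

def z_hat(n_samples=5):
    """Importance sampling with a uniform proposal q(x) = 1/10: unbiased for Z."""
    idx = rng.integers(0, 10, size=n_samples)
    return np.mean(f[idx] / (1.0 / 10.0))

reps = np.array([z_hat() for _ in range(20000)])
print(f"E[Z_hat]     ~ {reps.mean():.3f}   (exact Z = {Z:.3f})")
print(f"E[log Z_hat] ~ {np.log(reps).mean():.3f}  <  log Z = {np.log(Z):.3f}")
```

The sample mean of the replicates recovers $Z$, while the mean of their logs sits measurably below $\log Z$; the gap grows with the estimator's variance.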
2. RAISE: Formulation and Theoretical Properties
RAISE reframes the estimation of $p(\mathbf{v})$ as the computation of a partition function of an augmented joint distribution, enabling the use of AIS in a reverse mode to yield a stochastic lower bound on $\log p(\mathbf{v})$. For RBMs, the marginalized likelihood is
$$p(\mathbf{v}) = \sum_{\mathbf{h}} \frac{e^{-E(\mathbf{v},\mathbf{h})}}{Z},$$
where $\mathbf{v}$ are the visible units and $\mathbf{h}$ the hidden. Identifying the summand as an unnormalized distribution over $\mathbf{h}$ (with $\mathbf{v}$ fixed), its sum over $\mathbf{h}$ gives the partition function for this "augmented" distribution, which equals $p(\mathbf{v})$ itself (Upadhya et al., 2015).
A sequence of intermediate distributions $f_0, f_1, \dots, f_K$ is specified as
$$f_k(\mathbf{x}) \propto f_0(\mathbf{x})^{1-\beta_k} f_K(\mathbf{x})^{\beta_k}, \qquad 0 = \beta_0 < \beta_1 < \cdots < \beta_K = 1,$$
where $f_0$ is a tractable proposal (e.g., uniform or base-rate) with known normalizer $Z_0$, and $f_K$ is the unnormalized target (Upadhya et al., 2015, Burda et al., 2014).
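As a concrete sketch of this geometric path (on a toy discrete state space rather than the paper's RBM; the proposal, target, and schedule are illustrative assumptions):

```python
# Geometric annealing path: f_k(x) = f0(x)^(1-beta_k) * fK(x)^beta_k.
import numpy as np

xs = np.arange(8)
f0 = np.ones_like(xs, dtype=float)  # uniform proposal: Z0 = 8, known exactly
fK = np.exp(-((xs - 5.0) ** 2))     # unnormalized target density

betas = np.linspace(0.0, 1.0, 11)   # linear beta-schedule, beta_0=0 .. beta_K=1
path = [f0 ** (1.0 - b) * fK ** b for b in betas]

# Endpoints recover the proposal and the target exactly.
assert np.allclose(path[0], f0) and np.allclose(path[-1], fK)

for b, fk in zip(betas, path):
    p = fk / fk.sum()  # normalized intermediate distribution
    print(f"beta={b:.1f}  P(x=5) = {p[5]:.3f}")
```

Each intermediate distribution is a compromise between the flat proposal and the peaked target, so adjacent $f_{k-1}, f_k$ pairs overlap well and the per-step importance ratios stay well-behaved.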
The RAISE estimator for $p(\mathbf{v})$ after a single reverse chain is
$$\hat{p}(\mathbf{v}) = \frac{f_K(\mathbf{v})}{Z_0} \prod_{k=1}^{K} \frac{f_{k-1}(\mathbf{x}_{k-1})}{f_k(\mathbf{x}_{k-1})},$$
where $Z_0$ is the known normalizer for $f_0$, and the sequence $\mathbf{x}_K, \mathbf{x}_{K-1}, \dots, \mathbf{x}_0$ is constructed by running a Markov chain backwards from the observed $\mathbf{v}$ through the annealing path (Upadhya et al., 2015).
The estimator satisfies $\mathbb{E}[\hat{p}(\mathbf{v})] = p(\mathbf{v})$, and by Jensen’s inequality $\mathbb{E}[\log \hat{p}(\mathbf{v})] \le \log p(\mathbf{v})$, ensuring a conservative (lower bound) estimate (Burda et al., 2014).
3. Algorithmic Description
A single-chain version of RAISE for RBMs proceeds as follows (Upadhya et al., 2015):
```
Input:  test point v, model parameters θ, proposal f₀, β-schedule {β₀,…,β_K}, steps K
Output: estimate p̂(v)

1. Precompute: Z₀ ← partition function of proposal f₀ (analytic)
2. Initialize: x_K ← v;  w ← f_K(v)/Z₀
3. For k = K downto 1:
   a) Sample x_{k−1} ∼ T_k(· | x_k)        (transitions keep f_k invariant)
   b) Compute r ← f_{k−1}(x_{k−1}) / f_k(x_{k−1})
   c) Update w ← w × r
4. Return: p̂(v) = w
```
For tractable posteriors (e.g., RBMs), the reverse chain can sample directly from $p(\mathbf{h} \mid \mathbf{v})$ at initialization. For intractable cases (e.g., DBMs), additional “heating” transitions are necessary as described in (Burda et al., 2014).
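The single-chain procedure above can be sketched end-to-end on a tiny binary RBM, small enough that the exact log-likelihood is available by brute-force enumeration. This is a hedged illustration, not the authors' implementation: the model sizes, uniform proposal $f_0 \equiv 1$, linear β-schedule, and all helper names are assumptions made for the demo.

```python
# Minimal single-chain RAISE sketch for a tiny binary RBM.
import itertools
import numpy as np

rng = np.random.default_rng(1)
nv, nh, K = 3, 2, 200

W = rng.normal(0.0, 0.5, size=(nv, nh))  # weights
b = rng.normal(0.0, 0.1, size=nv)        # visible biases
c = rng.normal(0.0, 0.1, size=nh)        # hidden biases

def score(v, h):
    """log f_K(v, h) = -E(v, h) for the RBM."""
    return v @ W @ h + b @ v + c @ h

def log_fK_marginal(v):
    """log sum_h f_K(v, h); tractable for RBMs (negative free energy)."""
    return b @ v + np.sum(np.logaddexp(0.0, v @ W + c))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gibbs_step(v, h, beta):
    """One Gibbs sweep leaving f_k ∝ exp(beta * score) invariant
    (with uniform f0, tempering just scales the conditionals by beta)."""
    h = (rng.random(nh) < sigmoid(beta * (v @ W + c))).astype(float)
    v = (rng.random(nv) < sigmoid(beta * (W @ h + b))).astype(float)
    return v, h

def raise_log_weight(v_test):
    """One reverse chain: log of the single-chain RAISE weight."""
    betas = np.linspace(0.0, 1.0, K + 1)
    h = (rng.random(nh) < sigmoid(v_test @ W + c)).astype(float)  # h ~ p(h|v)
    v = v_test.copy()
    log_w = log_fK_marginal(v_test) - (nv + nh) * np.log(2.0)  # log f_K(v)/Z0
    for k in range(K, 0, -1):
        v, h = gibbs_step(v, h, betas[k])                  # x_{k-1} ~ T_k
        log_w += (betas[k - 1] - betas[k]) * score(v, h)   # log f_{k-1}/f_k
    return log_w

# Ground truth by brute force over all 2^(nv+nh) joint states.
states = [np.array(s, dtype=float)
          for s in itertools.product([0, 1], repeat=nv + nh)]
logZ = np.logaddexp.reduce(np.array([score(s[:nv], s[nv:]) for s in states]))
v_test = np.array([1.0, 0.0, 1.0])
exact = log_fK_marginal(v_test) - logZ

est = np.mean([raise_log_weight(v_test) for _ in range(200)])
print(f"RAISE mean log-weight: {est:.3f}   exact log p(v): {exact:.3f}")
```

On this toy model with $K = 200$, the mean log-weight lands at or just below the exact value; shrinking $K$ or using a cruder proposal widens the gap.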
4. Hyperparameters and Estimation Quality
The principal hyperparameters include:
- Number of intermediate distributions ($K$): Increasing $K$ reduces bias and variance at linear computational cost. Large $K$ is critical for ensuring conservativeness, especially with a uniform proposal.
- $\beta$-schedule: Linear spacing is standard, but denser schedules in regions of rapid distributional change may reduce variance.
- Proposal $f_0$: A uniform proposal is safest for conservativeness but requires large $K$. A base-rate RBM proposal can yield smaller bias but may slightly overestimate unless $K$ is very large.
- Reverse chain runs per datum: While a single reverse chain can suffice for large $K$, multiple chains further reduce variance (Upadhya et al., 2015).
- Variance reduction: Subtracting a control variate from the per-datum log-weights substantially reduces estimator variance over the test set (Burda et al., 2014).
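A β-schedule along the lines of the second bullet might be sketched as follows; `linear_schedule`, `warped_schedule`, and the `power` warping are hypothetical helpers, and packing points near $\beta = 1$ is just one example of non-uniform spacing, not a recommendation from the papers.

```python
# Hypothetical schedule helpers: a standard linear beta-schedule, plus a
# warped variant whose spacing shrinks as beta -> 1 (for power > 1).
import numpy as np

def linear_schedule(K):
    return np.linspace(0.0, 1.0, K + 1)

def warped_schedule(K, power=2.0):
    """Concentrate intermediate distributions near beta = 1."""
    u = np.linspace(0.0, 1.0, K + 1)
    return 1.0 - (1.0 - u) ** power

lin, war = linear_schedule(10), warped_schedule(10)
print(np.round(lin, 3))
print(np.round(war, 3))
```

Both schedules start at $\beta_0 = 0$ and end at $\beta_K = 1$; only the density of intermediate points differs.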
5. Empirical Performance and Computational Cost
RAISE, AIS, and CSL were compared by (Upadhya et al., 2015) on MNIST using RBMs with 20, 200, and 500 hidden units. Representative log-likelihood results (average over 500 test points) are shown below:
| n hidden | AIS | CSL | RAISE (uniform) | RAISE (base-rate) |
|---|---|---|---|---|
| 20 | –142.38 | –143.58 | –145.99 | –144.14 |
| 200 | –112.96 | –142.64 | –112.46 | –109.01 |
| 500 | –116.46 | –154.76 | –118.02 | –112.04 |
Interpretation:
- AIS generally provides the tightest (but potentially over-optimistic) estimates due to its tendency to underestimate $\log Z$.
- CSL is highly conservative with significant downward bias unless very large numbers of Gibbs samples are used.
- RAISE is a lower-bound estimator; for large $K$, RAISE's log-likelihood estimates closely approach AIS and the ground truth (exactly computable for small hidden-layer sizes), outperforming CSL by a significant margin. A base-rate proposal may slightly overestimate unless $K$ is high, while the uniform proposal is strictly conservative (but slower).
- Computational cost: RAISE, requiring one full reverse chain per test datum, is two to three orders of magnitude more expensive on MNIST than AIS (which needs chains only per model, not per test point), and more costly than CSL (Upadhya et al., 2015, Burda et al., 2014).
More extensive experiments on RBMs, DBMs, and DBNs, including Omniglot and larger models (Burda et al., 2014), confirm that RAISE typically underestimates test log-likelihood by less than 1 nat relative to AIS, demonstrating that AIS and RAISE tightly bracket the true value. Empirical plots show AIS leveling off to a possibly optimistic value, while RAISE's lower bound rises monotonically to converge just below AIS (Burda et al., 2014).
6. Relationships to CSL and AIS
RAISE synthesizes elements of both CSL and AIS:
- CSL: Estimates $p(\mathbf{v})$ by Monte Carlo averaging of $p(\mathbf{v} \mid \mathbf{h})$ over sampled hidden states, but suffers strong downward (conservative) bias due to Jensen's inequality.
- AIS: Provides an unbiased estimate of $Z$, yielding stochastic lower bounds on $\log Z$, but upper bounds on $\log p(\mathbf{v})$.
- RAISE: By casting $p(\mathbf{v})$ as a partition function and applying AIS “in reverse,” RAISE produces an unbiased estimator of $p(\mathbf{v})$. Jensen’s inequality assures conservativeness: $\mathbb{E}[\log \hat{p}(\mathbf{v})] \le \log p(\mathbf{v})$, i.e., a true stochastic lower bound (Upadhya et al., 2015, Burda et al., 2014).
This lower-bound property holds for any finite $K$; as $K \to \infty$, the bound converges to the true log-likelihood of the model.
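A quick numeric check of this convergence (a toy model of the estimator, not from the papers): treat $\hat{p}(\mathbf{v})$ as the true value times lognormal noise with mean one, so it stays unbiased; shrinking the noise variance plays the role of increasing $K$, and the Jensen gap closes accordingly.

```python
# Toy check: unbiased p_hat gives E[log p_hat] <= log p(v), with a gap
# of sigma^2/2 for lognormal noise that vanishes as the variance shrinks.
import numpy as np

rng = np.random.default_rng(2)
log_p = -10.0  # stand-in "true" log-likelihood

gaps = []
for sigma in (1.0, 0.3, 0.1):
    # lognormal with E[noise] = 1 requires mean = -sigma^2/2 in log-space
    noise = rng.lognormal(mean=-sigma**2 / 2.0, sigma=sigma, size=100_000)
    gap = log_p - np.mean(log_p + np.log(noise))
    gaps.append(gap)
    print(f"sigma={sigma:.1f}  E[log p_hat] below log p(v) by ~{gap:.4f} "
          f"(theory: {sigma**2 / 2.0:.4f})")
```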
7. Practical Recommendations
RAISE requires only the MCMC transition kernels used in standard AIS, making implementation straightforward for anyone with an existing AIS codebase. Gibbs sampling is commonly used for these transitions. Using a data base-rate proposal improves accuracy and reduces error for both AIS and RAISE, and a control variate can dramatically reduce variance in empirical settings (Burda et al., 2014).
For RBMs with moderate numbers of hidden units, RAISE is feasible on modern hardware, but for datasets with many test examples or expensive forward/reverse chains, the cost may be substantial. In models with intractable posteriors, additional “diagnostic” heating chains are required for correctness (Burda et al., 2014).
RAISE thus provides a robust, conservative methodology for evaluating generative undirected models, complementing standard AIS with rigorous lower-bound guarantees on log-likelihood estimation (Upadhya et al., 2015, Burda et al., 2014).