Semistochastic Heat-Bath CI (SHCI)

Updated 18 January 2026

Semistochastic Heat-Bath Configuration Interaction (SHCI) is a method that achieves near–full-configuration-interaction accuracy by combining efficient variational heat-bath determinant selection with semistochastic perturbative corrections.
It employs a heat-bath screening rule to select only significant determinants and uses a three-tier semistochastic PT2 algorithm to control computational cost and memory use.
SHCI enables practical FCI-level treatments for complex molecular systems, delivering chemical accuracy through orbital optimization and scalable parallel computing strategies.

Semistochastic Heat-bath Configuration Interaction (SHCI) is a selected configuration interaction plus perturbation theory (SCI+PT) method that delivers near–full-configuration-interaction (FCI) accuracy for electronic structure calculations at a dramatically reduced computational cost. By combining a compact variational expansion obtained via a “heat-bath” determinant selection rule with a memory- and time-efficient semistochastic second-order Epstein–Nesbet perturbative correction, SHCI makes FCI-level treatments practical for molecular systems with single- or multi-reference character and large electron counts (Yao et al., 2021, Jerzyk et al., 19 Nov 2025, Chien et al., 2018, Sharma et al., 2016, Li et al., 2018, Wang et al., 2023, Yao et al., 2020, Mussard et al., 2017, Holmes et al., 2017, Yao et al., 2021).

1. Theoretical Framework and Determinant Selection

SHCI operates in two principal stages: variational determinant selection and semistochastic perturbative correction. In both stages, the computational efficiency is rooted in an aggressive yet rigorously controlled screening criterion.

Variational ansatz and heat-bath selection:

The variational wavefunction is expanded over a space $V$ of Slater determinants: $|\Psi_V\rangle = \sum_{i\in V} c_i |D_i\rangle$, with coefficients $c_i$ obtained by diagonalizing the electronic Hamiltonian restricted to $V$. The variational energy is $E_V = \langle\Psi_V|\hat H|\Psi_V\rangle$.

New external determinants $D_a$ are added to $V$ if they meet the heat-bath criterion: $\exists\, D_i \in V: \quad |H_{ai}c_i| \geq \epsilon_1$, where $H_{ai} = \langle D_a|\hat H|D_i\rangle$ and $\epsilon_1$ is a user-specified variational threshold. In practice, SHCI precomputes and sorts two-electron integrals by magnitude, and for each determinant $D_i$ generates only those excitations whose matrix-element magnitudes are at least $\epsilon_1/|c_i|$. Because the lists are sorted, generation stops at the first element below this cutoff. This eliminates explicit enumeration of the entire excitation space and ensures that only the most significant determinants are selected (Yao et al., 2021, Chien et al., 2018, Li et al., 2018).
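The sorted-list screening can be illustrated with a minimal Python sketch. The table layout and the names `build_sorted_table` and `screened_excitations` are hypothetical and chosen only for illustration; production SHCI codes use their own data structures. The sketch assumes a toy table that maps each occupied orbital pair to the double excitations it can reach, and a nonzero coefficient `c_i`.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]
SortedList = List[Tuple[float, Pair]]

def build_sorted_table(raw: Dict[Pair, Dict[Pair, float]]) -> Dict[Pair, SortedList]:
    """Sort each occupied pair's target list once, by decreasing |H|."""
    return {
        pq: sorted(((abs(h), rs) for rs, h in targets.items()), reverse=True)
        for pq, targets in raw.items()
    }

def screened_excitations(table: Dict[Pair, SortedList],
                         occupied_pairs: List[Pair],
                         c_i: float,
                         eps1: float) -> List[Tuple[Pair, Pair]]:
    """Return (pq, rs) excitations with |H| >= eps1 / |c_i| (c_i must be nonzero)."""
    cutoff = eps1 / abs(c_i)
    selected = []
    for pq in occupied_pairs:
        for magnitude, rs in table.get(pq, []):
            if magnitude < cutoff:
                break  # list is sorted: every later entry is smaller still
            selected.append((pq, rs))
    return selected

# Toy data (assumed values): two occupied pairs and their excitation magnitudes.
raw = {(0, 1): {(2, 3): 0.5, (2, 4): -0.05, (3, 4): 0.2},
       (0, 2): {(1, 3): 0.01}}
table = build_sorted_table(raw)
print(screened_excitations(table, [(0, 1), (0, 2)], c_i=0.1, eps1=0.01))
```

The cost of the inner loop is proportional to the number of excitations that pass the cutoff, not to the full list length, which is the point of sorting once up front.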

2. Semistochastic Second-order Perturbative Correction

Following variational convergence, the correlation energy outside $V$ is estimated via Epstein–Nesbet second-order perturbation theory (PT2): $\Delta E^{(2)} = \sum_{D_a\in P} \frac{\left( \sum_{i\in V} H_{ai} c_i \right)^2}{E_V - H_{aa}}$, where $P$ is the set of determinants connected to $V$ but not contained within it, and $H_{aa}$ is the diagonal element for the external determinant $D_a$.
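As a worked illustration of this formula, the following sketch evaluates the PT2 sum exactly for a small dense Hamiltonian matrix. It is a toy under stated assumptions: `H` is stored as a full NumPy array, `numpy.linalg.eigh` stands in for a sparse eigensolver, and the function name `pt2_correction` is hypothetical. No screening is applied and near-zero denominators are not handled.

```python
import numpy as np

def pt2_correction(H: np.ndarray, V) -> tuple:
    """Return (E_V, Delta E^(2)) for variational index set V of a dense H."""
    V = np.asarray(V)
    P = np.setdiff1d(np.arange(H.shape[0]), V)   # external determinants
    evals, evecs = np.linalg.eigh(H[np.ix_(V, V)])
    E_V, c = evals[0], evecs[:, 0]               # lowest state within V
    numer = (H[np.ix_(P, V)] @ c) ** 2           # (sum_i H_ai c_i)^2 for each a
    denom = E_V - np.diag(H)[P]                  # E_V - H_aa
    return float(E_V), float(np.sum(numer / denom))

# Two-determinant toy problem: V = {0}, one external determinant.
H = np.array([[0.0, 0.1],
              [0.1, 1.0]])
E_V, dE2 = pt2_correction(H, [0])
print(E_V, dE2, E_V + dE2)
```

For this toy matrix the corrected energy lies close to the exact lowest eigenvalue, $(1-\sqrt{1.04})/2$, and the correction is negative because every denominator $E_V - H_{aa}$ is negative.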

To ensure tractability, SHCI introduces a perturbative screening threshold $\epsilon_2 < \epsilon_1$ and includes only numerator terms with $|H_{ai}c_i| \geq \epsilon_2$ (Yao et al., 2021, Jerzyk et al., 19 Nov 2025, Sharma et al., 2016). The full PT2 sum is partitioned into three components:

Deterministic part: Contributions above a larger cutoff ($\epsilon_2^{\rm dtm}$), accumulated exactly.
Pseudo-stochastic part: Contributions down to an intermediate cutoff ($\epsilon_2^{\rm psto}$), where batches of external determinants are sampled and summed.
Fully stochastic part: Remaining contributions down to $\epsilon_2$, estimated via random sampling.

This three-step scheme yields unbiased energy estimates with a statistical error that can be made arbitrarily small by increasing the sample count. The entire memory bottleneck associated with forming all partial sums for large spaces is circumvented; at most, only one batch of samples is held in memory during each pass (Yao et al., 2021, Sharma et al., 2016, Li et al., 2018).
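The unbiasedness claim can be made concrete with a small sketch of a sampled PT2 estimator. This is an illustration of the sampling idea only, not the three-tier production algorithm: it samples variational determinants from a multinomial distribution with probabilities proportional to $|c_i|$ (an assumed choice), and the function names are hypothetical. The second term in `sampled_pt2` removes the bias that squaring a sampled sum would otherwise introduce.

```python
import itertools
import numpy as np

def exact_pt2(x: np.ndarray, denom: np.ndarray) -> float:
    """x[a, i] = H_ai c_i and denom[a] = E_V - H_aa."""
    return float(np.sum(x.sum(axis=1) ** 2 / denom))

def sampled_pt2(x, denom, p, counts) -> float:
    """Unbiased PT2 estimate from multinomial counts (total draws N_d >= 2).

    counts[i] is how often variational determinant i was drawn; p[i] is its
    sampling probability.
    """
    n_d = counts.sum()
    first = (x * (counts / p)).sum(axis=1) ** 2
    second = ((counts * (n_d - 1) / p - counts**2 / p**2) * x**2).sum(axis=1)
    return float(np.sum((first + second) / denom) / (n_d * (n_d - 1)))

def draw_estimate(x, denom, c, n_d, rng) -> float:
    """Draw one sample of n_d determinants with p_i proportional to |c_i|."""
    p = np.abs(c) / np.abs(c).sum()
    return sampled_pt2(x, denom, p, rng.multinomial(n_d, p))

# Toy data (assumed values): 3 variational and 2 external determinants.
c = np.array([0.9, 0.4, 0.1])
H_ext = np.array([[0.2, -0.1, 0.05],
                  [0.0, 0.3, -0.2]])
x = H_ext * c
denom = np.array([-1.0, -2.0])

rng = np.random.default_rng(0)
estimates = [draw_estimate(x, denom, c, n_d=50, rng=rng) for _ in range(2000)]
print(exact_pt2(x, denom), np.mean(estimates), np.std(estimates) / np.sqrt(2000))
```

Averaging more independent samples shrinks the statistical error as the inverse square root of the sample count, while each sample only needs the determinants it actually drew.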

3. Full Algorithmic Workflow and Implementation

The canonical SHCI workflow proceeds as follows (Yao et al., 2021, Chien et al., 2018, Jerzyk et al., 19 Nov 2025, Li et al., 2018):

Initialization: Input one- and two-electron integrals, initial determinant (often Hartree–Fock), and user thresholds $\epsilon_1, \epsilon_2$.
Variational iteration:
- Diagonalize $H$ in current $V$ by the Davidson method to obtain $\{c_i\}$ and $E_V$.
- For all determinants $D_i \in V$, generate and add external determinants $D_a$ with $|H_{ai}c_i| \geq \epsilon_1$.
- Terminate when energy change $\Delta E_V$ falls below tolerance.
Perturbative correction:
- Compute PT2 energy semistochastically as detailed above.
- Form total energy as $E^{\rm SHCI} = E_V + \Delta E^{(2)}$.
FCI extrapolation:
- Repeat for several $\epsilon_1$ values, collect pairs $(E^{\rm SHCI}, -\Delta E^{(2)})$.
- Fit $E^{\rm SHCI}$ as a function of $-\Delta E^{(2)}$ (typically quadratic with weights $\propto 1/[\Delta E^{(2)}]^2$) and extrapolate to $\Delta E^{(2)} \to 0$ to estimate the FCI energy.
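The extrapolation step in the workflow above can be sketched as a weighted quadratic least-squares fit. The function name `extrapolate_to_fci` and the synthetic data are assumptions made for illustration; the weights follow the $1/[\Delta E^{(2)}]^2$ form stated in the list, applied to the squared residuals.

```python
import numpy as np

def extrapolate_to_fci(e_total, de_pt2) -> float:
    """Weighted quadratic fit of E_SHCI against -Delta E^(2); returns the intercept."""
    x = -np.asarray(de_pt2, dtype=float)        # -Delta E^(2), positive
    y = np.asarray(e_total, dtype=float)
    sqrt_w = 1.0 / np.abs(x)                    # square root of weights 1/x^2
    A = np.vander(x, 3, increasing=True)        # columns: 1, x, x^2
    coef, *_ = np.linalg.lstsq(A * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return float(coef[0])                       # value at Delta E^(2) -> 0

# Synthetic data (not from any real calculation): exact quadratic with E_FCI = -1.
de = -np.array([0.02, 0.01, 0.005, 0.0025])
e_shci = -1.0 + 0.3 * (-de) + 2.0 * (-de) ** 2
print(extrapolate_to_fci(e_shci, de))
```

Scaling each row by the square root of its weight turns the weighted problem into an ordinary least-squares problem, which avoids ambiguity about how a fitting routine interprets its weight argument.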

This workflow is implemented using bit-packed representations of determinants, distributed-memory parallelization (MPI+OpenMP), and hash-based data structures for efficient batching in the perturbative step (Li et al., 2018).
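To give a sense of what a bit-packed determinant looks like, here is a minimal sketch that stores spin-orbital occupations as bits of a Python integer. The helper names are hypothetical, fermionic phase factors are omitted, and real implementations typically use fixed-width machine words rather than arbitrary-precision integers.

```python
def popcount(x: int) -> int:
    """Number of set bits, i.e. occupied spin orbitals."""
    return bin(x).count("1")

def excitation_degree(det_a: int, det_b: int) -> int:
    """Number of electrons that must move to turn det_a into det_b."""
    return popcount(det_a ^ det_b) // 2

def apply_double(det: int, p: int, q: int, r: int, s: int) -> int:
    """Move electrons from occupied orbitals p, q to empty orbitals r, s."""
    occ = (1 << p) | (1 << q)
    vir = (1 << r) | (1 << s)
    if (det & occ) != occ or (det & vir) != 0:
        raise ValueError("p, q must be occupied and r, s empty")
    return (det ^ occ) | vir

hf = 0b0011                                # orbitals 0 and 1 occupied
doubly_excited = apply_double(hf, 0, 1, 2, 3)
print(bin(doubly_excited), excitation_degree(hf, doubly_excited))
```

Since the Hamiltonian contains at most two-body terms, two determinants with excitation degree above two have a zero matrix element, so an XOR and a popcount are enough to skip them.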

4. Performance, Scaling, and Computational Aspects

SHCI achieves near-linear wall-clock scaling with respect to the size of the variational space for much of its workflow due to the heat-bath selection criterion and fast Hamiltonian construction methods (Yao et al., 2021, Li et al., 2018). Specific features include:

Variational selection: Negligible cost per determinant due to pre-sorted integral lists; complexity typically between $O(N_{\rm det}^{1.2})$ and $O(N_{\rm det}^{1.5})$, where $N_{\rm det}$ is the number of variational determinants.
Hamiltonian construction: Auxiliary arrays and hashing reduce the time to generate Hamiltonian matrix elements to nearly linear in $N_{\rm det}$ .
Davidson diagonalization: Parallelization is employed for both Hamiltonian build and eigensolving.
Perturbative correction: Naïve cost $O(N_{\rm elec}^2 N_{\rm unocc}^2 N_{\rm det})$ is rendered practical by aggressive screening and semistochastic evaluation; batching and stochastic sampling further lower memory and runtime.
Memory use: Scales as $O(N_{\rm det})$ in the perturbative step, with only the largest batch of partial sums held in core memory at any time.
Parallelization: Both variational and perturbative stages benefit from distributed-memory (MPI) and shared-memory (OpenMP) parallelism. The perturbative step is particularly amenable to embarrassingly parallel sampling (Li et al., 2018, Sharma et al., 2016).

Benchmarks demonstrate SHCI wall-times ranging from seconds to hours for systems with up to billions of determinants and Hilbert spaces exceeding $10^{32}$ determinants (Li et al., 2018, Yao et al., 2020).
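The size of the numbers involved can be checked with simple counting. The sketch below computes the FCI space dimension for a given orbital and electron count, and an upper bound on the determinants connected to a single determinant, which is the source of the $N_{\rm elec}^2 N_{\rm unocc}^2$ factor in the naïve PT2 cost. The function names are hypothetical, the connection count ignores spin and spatial symmetry, and the example sizes are arbitrary rather than taken from the cited benchmarks.

```python
from math import comb

def fci_dimension(n_orb: int, n_alpha: int, n_beta: int) -> int:
    """Number of determinants with fixed alpha and beta electron counts."""
    return comb(n_orb, n_alpha) * comb(n_orb, n_beta)

def connected_upper_bound(n_elec: int, n_spin_orb: int) -> int:
    """Singles plus doubles out of one determinant, ignoring symmetry."""
    n_unocc = n_spin_orb - n_elec
    return n_elec * n_unocc + comb(n_elec, 2) * comb(n_unocc, 2)

# Arbitrary illustrative sizes: 20 electrons in 100 spatial orbitals.
print(f"{fci_dimension(100, 10, 10):.3e}")
print(connected_upper_bound(20, 200))
```

Even for modest orbital counts the full space is far beyond enumeration, while the number of connections per determinant grows only polynomially, which is what screening exploits.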

5. Applications and Practical Impact

SHCI has been systematically applied to both main-group and transition-metal systems, including those with strong static and dynamic correlation (Chien et al., 2018, Jerzyk et al., 19 Nov 2025, Yao et al., 2021, Wang et al., 2023). Notable application areas:

Benchmark FCI energies: Accurate atomization energies, vertical excitation energies, and zero-field splittings in weakly and strongly correlated molecules (Yao et al., 2020, Chien et al., 2018).
Transition metals and heavy elements: Treatment of Cr, Mo, and W with effective core potentials and orbital optimization, and relativistic SHCI formulations for heavy elements (AuH₂⁻, NpO₂²⁺) using two- or four-component Hamiltonians (Jerzyk et al., 19 Nov 2025, Wang et al., 2023).
Basis set extrapolation and corrections: SHCI, in combination with range-separated density functional corrections (PBE-UEG and PBE-OT), achieves near–CBS-limit energies at reduced computational cost (Yao et al., 2021).
Excited states and near-degenerate systems: Simultaneous state-averaged SHCI for near-degenerate ions ensures correct treatment of ground and excited states (Jerzyk et al., 19 Nov 2025, Holmes et al., 2017).

Typical outcomes include chemical accuracy (errors below 1 kcal/mol, about 1.6 mHa) across broad datasets and sub-mHa precision for energy differences with respect to experiment and FCI (Yao et al., 2021, Yao et al., 2020).
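The unit correspondence behind the chemical-accuracy threshold is a one-line conversion. The sketch below uses the standard factor 1 Ha ≈ 627.5095 kcal/mol; the helper names and the example energies are hypothetical.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard conversion factor, rounded

def kcal_per_mol_to_mha(x: float) -> float:
    """Convert an energy from kcal/mol to millihartree."""
    return 1000.0 * x / HARTREE_TO_KCAL_PER_MOL

def within_chemical_accuracy(e_calc_ha: float, e_ref_ha: float,
                             tol_kcal_per_mol: float = 1.0) -> bool:
    """True if |e_calc - e_ref| is below the tolerance (energies in hartree)."""
    error = abs(e_calc_ha - e_ref_ha) * HARTREE_TO_KCAL_PER_MOL
    return error < tol_kcal_per_mol

print(round(kcal_per_mol_to_mha(1.0), 3))
print(within_chemical_accuracy(-1.0005, -1.0))   # 0.5 mHa error
```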

6. Orbital Optimization and Acceleration Strategies

For improved convergence and reduction of the variational space size, SHCI employs orbital optimization. Approaches are classified as uncoupled, fully coupled, or quasi-fully coupled; accelerated diagonal Newton and BFGS methods are recommended for robust convergence (Yao et al., 2021). Empirically, orbital optimization reduces the number of variational determinants needed to achieve a given accuracy by factors of two or more.

The orbital optimization step involves parameterizing orbital rotations,

$E(\mathbf X)\;=\;\langle\Psi_V|\mathrm e^{\hat X}\,\hat H\,\mathrm e^{-\hat X}|\Psi_V\rangle,$

where $\hat X$ is anti-Hermitian and the methods may utilize both gradient and Hessian information.
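A small numerical sketch shows why an anti-Hermitian generator is used: its exponential is unitary, so rotated orbitals stay orthonormal for any parameter values. The example assumes real orbitals, so $X$ is a real antisymmetric matrix built from the strictly lower-triangular entries of a hypothetical parameter array `kappa`; the function names and the sign convention of the rotation are illustrative choices, not those of any particular SHCI code.

```python
import numpy as np

def rotation_from_kappa(kappa: np.ndarray) -> np.ndarray:
    """Return U = exp(X), with X = L - L^T and L the strict lower triangle of kappa."""
    L = np.tril(kappa, -1)
    X = L - L.T                                  # real antisymmetric generator
    w, v = np.linalg.eigh(1j * X)                # iX is Hermitian
    U = (v * np.exp(-1j * w)) @ v.conj().T       # exp(X) = exp(-i (iX))
    return U.real                                # imaginary part is rounding noise

def rotate_one_body(h: np.ndarray, U: np.ndarray) -> np.ndarray:
    """Transform one-electron integrals to the rotated orbital basis."""
    return U.T @ h @ U

theta = 0.3
kappa = np.array([[0.0, 0.0],
                  [theta, 0.0]])
U = rotation_from_kappa(kappa)
h = np.array([[-1.0, 0.2],
              [0.2, 0.5]])
print(U)
print(rotate_one_body(h, U))
```

Because the parameters are unconstrained, gradient or quasi-Newton optimizers can update `kappa` freely without ever leaving the space of orthonormal orbitals.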

This approach significantly lowers both computational cost and statistical uncertainty, particularly in systems requiring large active spaces for adequate correlation description (Yao et al., 2021, Yao et al., 2020, Jerzyk et al., 19 Nov 2025).

7. Extensions, Limitations, and Future Directions

SHCI is extensible to relativistic Hamiltonians by using complex-valued integrals and spinor bases. The heat-bath selection rule and semistochastic PT2 remain valid for two- and four-component forms, enabling direct treatment of spin–orbit coupling and relativistic effects in heavy elements (Wang et al., 2023).

While SHCI approaches the FCI limit for a fixed basis set, the result still carries basis-set incompleteness error relative to the complete basis set (CBS) limit. Range-separated DFT corrections are used to "patch" this residual basis incompleteness, accelerating CBS convergence (Yao et al., 2021).

The main known limitation is scaling with system size, which remains exponential in the number of correlated electrons and orbitals, though the practical prefactor is dramatically reduced relative to FCI. Near-degeneracies, for example in low-lying states of heavy atoms, require state-averaged treatments that increase cost but are manageable in the SHCI workflow (Jerzyk et al., 19 Nov 2025). Effective core potentials and pseudopotential approximations reduce the computational burden with minimal impact on accuracy when the core electrons are chemically inert.

SHCI is now capable of providing FCI-quality results for a wide class of molecules and materials of interest to quantum chemistry, transition metal chemistry, and fusion plasma modeling (Jerzyk et al., 19 Nov 2025, Yao et al., 2021, Wang et al., 2023, Yao et al., 2020).

Table: SHCI Algorithmic Steps and Key Features

Stage	Main Operation	Computational Benefit
Determinant selection	Heat-bath criterion $|H_{ai}c_i|\ge \epsilon_1$	Only significant determinants considered
Variational Diagonalization	Davidson in sparse $V$	Efficient for large sparse matrices
PT2 Screening	$|H_{ai}c_i| \ge \epsilon_2$	Orders-of-magnitude reduction in cost
Semistochastic PT2	Three-step (deterministic, pseudo-stochastic, stochastic)	Unbiased, memory efficient, controllable error
Extrapolation	Fit $E^{\rm SHCI}(-\Delta E^{(2)})$	Systematic approach to the FCI limit
Orbital Optimization	Accelerated Newton/BFGS schemes	Reduces $N_{\rm det}$, accelerates convergence