Bernoulli Micro-Kernel for Quantum PDE Sampling

Updated 23 November 2025

The Bernoulli micro-kernel is a quantum computational primitive that performs explicit stencil node updates in finite-difference PDE solvers using shallow, single-qubit circuits.
It leverages constant-resource Monte Carlo sampling to estimate convex-sum stencil updates, ensuring unbiased estimators with error convergence of O(1/√M).
Empirical evaluations on simulators and NISQ devices demonstrate its scalability, lower bias, and improved accuracy compared to deeper, entangling alternatives.

The Bernoulli micro-kernel is a quantum computational primitive designed to perform the explicit stencil node updates that arise in finite-difference solvers for partial differential equations (PDEs). It serves as a localized, constant-resource Monte Carlo subroutine, implementable with shallow single-qubit quantum circuits, for accelerating the sampling of convex-sum stencil updates. Its cost in qubits and circuit depth does not grow with the problem size, which makes it suitable for orchestrated, massively parallel execution over computational grids. The Bernoulli micro-kernel is one realization of the broader QPU micro-kernel concept, in which a quantum processor (QPU) acts as a sampling accelerator invoked by a classical host that maintains the outer iteration structure (Markidis et al., 16 Nov 2025).

1. Stencil Computation Framework and the Micro-Kernel Paradigm

Explicit finite-difference PDE solvers, such as the Forward-Time Centered-Space (FTCS) method for the 1D heat equation, update the value at each spatial node $i$ at time $n+1$ according to a convex combination of neighbor values at time $n$:

$u_i^{n+1} = w_L\,u_{i-1}^n + w_C\,u_i^n + w_R\,u_{i+1}^n, \qquad w_L + w_C + w_R = 1, \quad w_b\ge0.$

In the QPU micro-kernel framework, the classical host iterates over time steps $n$ and nodes $i$, invoking the quantum micro-kernel only to obtain unbiased Monte Carlo estimates of the stencil update for each node. This approach offloads the local convex-sum operation to the quantum device, leaving the global time loop and grid iteration on the classical host (Markidis et al., 16 Nov 2025).
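The host-side structure described above can be sketched in plain Python. This is a minimal illustration, not the authors' code: it assumes the textbook FTCS weights for the 1D heat equation, $w_L = w_R = r$ and $w_C = 1 - 2r$ with $r = \alpha\,\Delta t/\Delta x^2$, and fixed (Dirichlet) boundary values. The names `ftcs_weights`, `classical_node_update`, and `host_time_step` are hypothetical; the `node_update` argument marks the slot where a quantum micro-kernel call would replace the exact classical update.

```python
def ftcs_weights(r):
    """Stencil weights for FTCS on the 1D heat equation, r = alpha*dt/dx**2.

    The weights are non-negative (a convex combination) only for 0 <= r <= 0.5.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("convex weights require 0 <= r <= 0.5")
    return {"L": r, "C": 1.0 - 2.0 * r, "R": r}


def classical_node_update(u_left, u_center, u_right, weights):
    """Exact convex-sum update; a micro-kernel would estimate this value."""
    return (weights["L"] * u_left
            + weights["C"] * u_center
            + weights["R"] * u_right)


def host_time_step(u, weights, node_update=classical_node_update):
    """One explicit time step over all interior nodes.

    Boundary values are held fixed (an assumption of this sketch).
    """
    new_u = list(u)
    for i in range(1, len(u) - 1):
        new_u[i] = node_update(u[i - 1], u[i], u[i + 1], weights)
    return new_u
```

Passing a sampling-based function as `node_update` leaves the time loop and grid iteration on the host, which mirrors the division of labor described in the text.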

2. Bernoulli Micro-Kernel Circuit Structure

The Bernoulli micro-kernel operates via single-qubit circuits for each stencil branch $b\in\{L,C,R\}$:

The qubit is initialized in $\ket{0}$.
A single-qubit $R_y$ rotation is applied with angle $\theta(u_b) = 2\arcsin\sqrt{u_b'}$, where $u_b'$ is the affine-normalized neighbor value mapped to $[0,1]$.
The qubit is measured in the computational basis, yielding a Bernoulli sample with $\Pr(\text{outcome}=1)=u_b'$, because $R_y(\theta)\ket{0}$ is measured as 1 with probability $\sin^2(\theta/2)$.
This process is repeated $M_b$ times per branch to obtain an empirical mean $\hat{u}_b$.

No entanglement or multi-qubit operations are involved; each branch is executed independently (Markidis et al., 16 Nov 2025).
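The circuit steps above can be mimicked classically, which is useful for checking the encoding before running on a device. The sketch below is an illustration under stated assumptions: it computes the rotation angle, evaluates the ideal (noiseless) probability $\sin^2(\theta/2)$ of measuring 1, and draws shots with Python's `random` module in place of a real qubit. The function names are hypothetical.

```python
import math
import random


def ry_angle(u_norm):
    """Rotation angle theta = 2*asin(sqrt(u')) for a normalized value u' in [0, 1]."""
    if not 0.0 <= u_norm <= 1.0:
        raise ValueError("normalized value must lie in [0, 1]")
    return 2.0 * math.asin(math.sqrt(u_norm))


def prob_one(theta):
    """Ideal probability of measuring 1 after R_y(theta) acts on |0>."""
    return math.sin(theta / 2.0) ** 2


def sample_branch(u_norm, shots, rng):
    """Empirical mean of `shots` simulated single-qubit measurements."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    p = prob_one(ry_angle(u_norm))
    ones = sum(1 for _ in range(shots) if rng.random() < p)
    return ones / shots
```

For example, `sample_branch(0.3, 4000, random.Random(0))` returns an estimate of the encoded value 0.3 whose accuracy improves as the shot count grows.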

3. Data Encoding and Shot Allocation

Neighbor values $u_b$ originally in $[u_{\min}, u_{\max}]$ are linearly normalized to $u_b'$ in $[0,1]$:

$u_b' = \frac{u_b - u_{\min}}{u_{\max}-u_{\min}} \in [0,1].$

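Given a per-node shot budget $M$, shots are allocated proportionally to the stencil weights:

$M_b = \lfloor w_b\, M \rceil, \quad \sum_b M_b = M.$

Each branch executes its $R_y$ circuit $M_b$ times, so the shot counts themselves carry the convex weights (Markidis et al., 16 Nov 2025). Here $\lfloor\cdot\rceil$ denotes rounding to the nearest integer; since rounding alone does not guarantee that the counts sum to $M$, any surplus or deficit has to be absorbed by adjusting individual $M_b$.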
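A short sketch of the encoding and allocation steps follows. The normalization matches the formula above. For the allocation, the source does not say how rounding is reconciled with the constraint $\sum_b M_b = M$, so this example assumes a largest-remainder rule: take the floor of each $w_b M$ and hand the leftover shots to the branches with the largest fractional parts. The function names are hypothetical.

```python
import math


def normalize(u, u_min, u_max):
    """Affine map from [u_min, u_max] to [0, 1]."""
    if u_max <= u_min:
        raise ValueError("u_max must exceed u_min")
    return (u - u_min) / (u_max - u_min)


def allocate_shots(weights, total_shots):
    """Split total_shots across branches in proportion to the weights.

    Largest-remainder rounding (an assumption of this sketch) keeps each
    M_b within one shot of w_b * M while forcing the counts to sum to M.
    """
    ideal = {b: w * total_shots for b, w in weights.items()}
    shots = {b: math.floor(x) for b, x in ideal.items()}
    leftover = total_shots - sum(shots.values())
    by_remainder = sorted(weights, key=lambda b: ideal[b] - shots[b], reverse=True)
    for b in by_remainder[:leftover]:
        shots[b] += 1
    return shots
```

With weights `{"L": 0.25, "C": 0.5, "R": 0.25}` and a budget of 4000 shots, the products $w_b M$ are already integers, so no adjustment is needed; with three equal weights and 100 shots, one branch receives the single leftover shot.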

4. Estimator Construction and Statistical Properties

Let $X_b^{(s)}\in\{0,1\}$ be the outcome of the $s$-th measurement for branch $b$. The convex-sum estimator for $u_i^{n+1}$ is

$\hat{u}_i^{n+1} = \frac{1}{M}\sum_{b\in\{L,C,R\}} \sum_{s=1}^{M_b} X_b^{(s)} = \sum_{b\in\{L,C,R\}} \frac{M_b}{M}\, \hat{u}_b,$

where $\hat{u}_b = \frac{1}{M_b}\sum_{s} X_b^{(s)}$.

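The estimator is unbiased for the normalized update when the allocation is exact ($M_b = w_b M$); with rounded shot counts the effective weights $M_b/M$ deviate from $w_b$ by at most $\mathcal{O}(1/M)$:

$\mathbb{E}[\hat{u}_i^{n+1}] = \sum_b \frac{M_b}{M}\, u_b' = \sum_b w_b\, u_b',$

and its variance is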

$\mathrm{Var}[\hat{u}_i^{n+1}] = \frac{1}{M}\sum_b w_b u_b'(1-u_b') \leq \frac{1}{4M}.$

The standard error vanishes as $\mathcal{O}(1/\sqrt{M})$ (Markidis et al., 16 Nov 2025).
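The estimator and its variance can be checked numerically with a small sketch. Two assumptions are made here: measurements are replaced by classical Bernoulli draws with success probability $u_b'$, and the estimate is mapped back to physical units with the inverse affine map, which is valid because the weights sum to 1. The function names are hypothetical.

```python
import random


def bernoulli_node_estimate(neighbors_norm, shots, rng):
    """Pooled estimate: total number of 1-outcomes over all branches, divided by M.

    neighbors_norm maps branch -> u'_b in [0, 1]; shots maps branch -> M_b.
    """
    total = sum(shots.values())
    ones = 0
    for b, u_norm in neighbors_norm.items():
        ones += sum(1 for _ in range(shots[b]) if rng.random() < u_norm)
    return ones / total


def estimator_variance(neighbors_norm, shots):
    """Var = (1/M^2) * sum_b M_b * u'_b * (1 - u'_b), for independent shots."""
    total = sum(shots.values())
    return sum(shots[b] * u * (1.0 - u) for b, u in neighbors_norm.items()) / total ** 2


def denormalize(u_norm, u_min, u_max):
    """Map a normalized estimate back to the original value range."""
    return u_min + u_norm * (u_max - u_min)
```

When $M_b = w_b M$, `estimator_variance` reduces to the closed form $\frac{1}{M}\sum_b w_b u_b'(1-u_b')$ given above, and it never exceeds $1/(4M)$ because $u(1-u) \le 1/4$ on $[0,1]$.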

5. Resource Requirements and Scaling

The Bernoulli micro-kernel achieves resource independence from grid size:

Qubit count per branch: 1 qubit.
Circuit depth per shot: one $R_y$ gate plus measurement.
No entanglement between qubits; no increase in circuit complexity with additional grid points.

This constancy makes micro-kernels amenable to large-scale grid parallelization, with classical orchestration handling all node and branch-level iteration (Markidis et al., 16 Nov 2025).

6. Error Behavior and Convergence

The standard error for each branch's mean estimator is

$\mathrm{SE}(\hat{u}_b) = \sqrt{\frac{u_b'(1-u_b')}{M_b}} \leq \frac{1}{2\sqrt{M_b}},$

and propagates through the convex-sum to

$\mathrm{SE}(\hat{u}_i^{n+1}) \leq \frac{1}{2\sqrt{M}}.$

Empirical studies using noiseless simulators confirm the $O(1/\sqrt{M})$ convergence: doubling $M$ reduces the estimator noise by a factor of approximately $\sqrt{2}$ (Markidis et al., 16 Nov 2025).
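The worst-case bound can be turned into a quick shot-budget calculation. The sketch below only restates the inequality $\mathrm{SE} \le 1/(2\sqrt{M})$ and inverts it to $M \ge 1/(4\varepsilon^2)$ for a target tolerance $\varepsilon$; it covers sampling noise only and says nothing about device noise. The function names are hypothetical.

```python
import math


def se_bound(total_shots):
    """Worst-case standard error 1 / (2 * sqrt(M)) of the node estimator."""
    if total_shots <= 0:
        raise ValueError("total_shots must be positive")
    return 1.0 / (2.0 * math.sqrt(total_shots))


def shots_for_tolerance(eps):
    """Smallest M whose worst-case standard error is at most eps."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.ceil(1.0 / (4.0 * eps * eps))
```

For $M = 4000$ the bound evaluates to roughly 0.0079 in normalized units, and a tolerance of 0.01 calls for on the order of 2500 shots per node.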

7. Empirical Evaluation on Simulators and Quantum Hardware

Benchmarks were conducted for both the heat and viscous Burgers' equations:

Hardware	Circuit depth	Gates	Errors ($L_\infty$, $L_2$)	Per-node wall time
Simulator	1	1 × $R_y$	$O(M^{-1/2})$	Not reported
IBM Brisbane (raw)	3	1 × $R_y$, 1 × X	0.0848, 0.0368	≈ 4.7 s ($M=4000$)
IBM Brisbane (readout-mitigated)	3	1 × $R_y$, 1 × X	0.0756, 0.0378	≈ 4.7 s ($M=4000$)

On IBM Brisbane, the Bernoulli micro-kernel with $M=4000$ shots per node achieved $L_\infty=0.0848$ and $L_2=0.0368$ without readout mitigation. Applying single-qubit readout calibration lowered $L_\infty$ to 0.0756 while leaving $L_2$ essentially unchanged at 0.0378. Circuit depth after transpilation was 3, with no two-qubit gates, and per-node wall time was approximately 4.7 s. In contrast, the branching micro-kernel exhibited higher error and deeper, more resource-intensive circuits (Markidis et al., 16 Nov 2025).

The results demonstrate that on present-day NISQ devices, the shallow, single-qubit Bernoulli micro-kernel consistently yields lower bias and higher accuracy relative to deeper, entangling alternatives, which are more susceptible to device noise (Markidis et al., 16 Nov 2025).
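The source does not describe how the single-qubit readout calibration was applied. One common approach, shown here purely as an assumption, inverts the 2×2 readout confusion model: if `e0` is the probability of reading 1 when 0 was prepared and `e1` the probability of reading 0 when 1 was prepared, the measured frequency satisfies $p_{\text{meas}} = p(1-e_1) + (1-p)\,e_0$, which can be solved for $p$. The function name and parameters are hypothetical.

```python
def mitigate_readout(p_meas, e0, e1):
    """Correct a measured 1-frequency for readout error.

    e0 = P(read 1 | prepared 0), e1 = P(read 0 | prepared 1), both taken
    from calibration circuits. The result is clipped to [0, 1] because
    shot noise can push the corrected value slightly outside that range.
    """
    denom = 1.0 - e0 - e1
    if denom <= 0.0:
        raise ValueError("readout errors too large to invert")
    p_true = (p_meas - e0) / denom
    return max(0.0, min(1.0, p_true))
```

In this scheme the correction is pure classical post-processing of each branch mean $\hat{u}_b$, so it adds no gates and no circuit depth.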

References

Markidis et al. (2025). QPU Micro-Kernels for Stencil Computation.
