Quantum Neuron Born Machines (QNBMs)

Updated 14 March 2026

Quantum Neuron Born Machines (QNBMs) are quantum generative models that embed neuron-style nonlinear activations via mid-circuit repeat-until-success (RUS) measurements to enhance expressivity.
They use the RUS subroutine to implement a controlled non-linearity, enabling discrete distribution learning and variational Bayesian inference.
Experimental and theoretical results show that QNBMs outperform standard QCBMs on constrained tasks and provide insights into scalable quantum circuit design.

Quantum Neuron Born Machines (QNBMs) are quantum generative models that incorporate neuron-style non-linear activations within quantum circuits. Building on the Born machine framework, QNBMs leverage mid-circuit measurements and classical feedback to implement quantum analogues of feed-forward neural networks. This architecture enables them to generate discrete distributions with enhanced expressivity, facilitating applications in quantum generative modeling and variational Bayesian inference, especially for binary neural networks. Their design offers a systematic way to embed non-linear dynamics into quantum machine learning models, with implications for both algorithmic performance and hardware stress-testing.

1. Theoretical Foundations and Model Structure

QNBMs are motivated by the limitations of linear quantum dynamics for generative modeling. In contrast to standard Quantum Circuit Born Machines (QCBMs), which are restricted to linear unitary evolution, QNBMs employ a quantum neuron subroutine that introduces controlled non-linear activation. Each quantum neuron is implemented via a repeat-until-success (RUS) mid-circuit measurement block, mediating a non-linear transformation on its output qubit conditioned on a weighted sum of input qubits.

Given an input register $|x_{\mathrm{in}}\rangle=|x_1\cdots x_n\rangle$, an output qubit $|\psi_{\mathrm{out}}\rangle$, and an ancilla $|0_a\rangle$, the RUS subroutine computes $\theta = \sum_{i=1}^n w_i x_i + b$ and applies the non-linear activation $q(\theta) = \arctan\left[\tan^2(\theta)\right]$, succeeding with probability $p(\theta) = \Pr(\mathrm{ancilla}=0)>1/2$. On a successful measurement outcome, the operation $R_Y(2q(\theta))$ is applied to $|\psi_{\mathrm{out}}\rangle$; otherwise, corrective operations and repeats are performed until success, guaranteeing finite expected depth for each neuron.

For a network of $N_{\mathrm{out}}$ output neurons, the joint probability over output bit-strings is defined by the final quantum state via the Born rule: $P_{\mathrm{model}}(x) = |\langle x|U_{\mathrm{QNBM}}(\{w,b\})|\psi_{\mathrm{in}}\rangle|^2$, where $|\psi_{\mathrm{in}}\rangle$ is typically a superposition state and $U_{\mathrm{QNBM}}$ is the product of layerwise neuron subroutines (Gili et al., 2022, Siddiqui et al., 2024).
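The activation and the Born rule can be evaluated numerically for a single output qubit. The following is a minimal sketch, assuming the output qubit starts in $|0\rangle$ and the RUS block has already succeeded; the function names are illustrative and do not come from the cited papers.

```python
import numpy as np

def activation(theta):
    """Neuron activation q(theta) = arctan(tan^2(theta))."""
    return np.arctan(np.tan(theta) ** 2)

def born_probabilities(state):
    """Born rule: P(x) = |<x|state>|^2 for each computational basis state x."""
    state = np.asarray(state, dtype=complex)
    return np.abs(state) ** 2

def neuron_output_state(theta):
    """State R_Y(2 q(theta)) |0> = cos(q)|0> + sin(q)|1> (assumes RUS success)."""
    q = activation(theta)
    return np.array([np.cos(q), np.sin(q)])

for theta in (0.3, np.pi / 4, 1.2):
    probs = born_probabilities(neuron_output_state(theta))
    print(f"theta={theta:.3f}  q={activation(theta):.3f}  P(1)={probs[1]:.3f}")
```

On the interval $(0, \pi/2)$ the activation has a fixed point at $\theta=\pi/4$, pushes smaller angles toward $0$, and pushes larger angles toward $\pi/2$, which is the sigmoid-like behaviour the neuron relies on.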

2. Non-Linear Activation via Quantum Neuron Subroutine

The core technical innovation of QNBMs is the non-linear quantum neuron. Unlike standard parameterized quantum circuits (PQCs), which are limited to compositions of linear maps, the RUS subroutine enables a controlled non-linear response at each layer. After parameterizing the weights $w_i \in (-1,1)$ and biases $b \in (-1,1)$, the subroutine (see Fig. 1 in Gili et al., 2022, and Siddiqui et al., 2024) applies controlled $R_Y(\pi/2)$ rotations from each input qubit to the ancilla and a bias rotation. The mid-circuit measurement on the ancilla projects the system into either the success or failure branch, with classical control determining whether to proceed or reset and repeat.

The expected number of repeats is finite (bounded by 7 in empirical tests) because $p(\theta) > 1/2$: when every trial succeeds with probability above one half, the expected number of trials is below two. Importantly, when the input register is in a superposition, the RUS module applies the non-linear activation coherently to each branch, up to small amplitude deformations $F_i$ (Gili et al., 2022).
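The success and failure branches can be illustrated with a small state-vector simulation. This is a sketch under stated assumptions, not the circuit from the cited papers: it assumes a commonly used RUS gadget ($R_Y(2\theta)$ on the ancilla, an ancilla-controlled $-iY$ on the output qubit, then $R_Y(-2\theta)$ on the ancilla), treats $\theta$ as a classical number for one fixed input bit-string, and corrects a failed trial with $R_Y(\pi/2)$ on the output. All function names are hypothetical.

```python
import numpy as np

def ry(angle):
    """Single-qubit rotation R_Y(angle)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])

I2 = np.eye(2)
MINUS_IY = np.array([[0.0, -1.0], [1.0, 0.0]])  # -iY, identical to ry(pi)
P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
CONTROLLED = np.kron(P0, I2) + np.kron(P1, MINUS_IY)  # ancilla controls output

def success_probability(theta):
    """Ancilla-0 probability of the assumed gadget: cos^4 + sin^4 >= 1/2."""
    return np.cos(theta) ** 4 + np.sin(theta) ** 4

def rus_trial(theta, psi_out, rng):
    """One RUS attempt; returns (success flag, output-qubit state)."""
    state = np.kron(np.array([1.0, 0.0]), psi_out)  # ordering: ancilla (x) output
    state = np.kron(ry(2 * theta), I2) @ state
    state = CONTROLLED @ state
    state = np.kron(ry(-2 * theta), I2) @ state
    branch0, branch1 = state[:2], state[2:]  # ancilla measured as 0 / 1
    p0 = float(np.vdot(branch0, branch0).real)
    if rng.random() < p0:
        return True, branch0 / np.sqrt(p0)
    # Failure leaves R_Y(-pi/2) on the output (up to sign); undo it and retry.
    failed = branch1 / np.linalg.norm(branch1)
    return False, ry(np.pi / 2) @ failed

def rus_neuron(theta, psi_out, rng, max_trials=100):
    """Repeat trials until the ancilla reads 0; returns (state, trials used)."""
    for trial in range(1, max_trials + 1):
        success, psi_out = rus_trial(theta, psi_out, rng)
        if success:
            return psi_out, trial
    raise RuntimeError("RUS did not succeed within max_trials")

rng = np.random.default_rng(0)
theta = 0.7
final, trials = rus_neuron(theta, np.array([1.0, 0.0]), rng)
target = ry(2 * np.arctan(np.tan(theta) ** 2)) @ np.array([1.0, 0.0])
print("trials:", trials, " overlap with R_Y(2q)|0>:", abs(np.vdot(final, target)))
```

Under these assumptions the final state matches $R_Y(2q(\theta))|0\rangle$ up to a global sign, regardless of how many failed trials preceded the success.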

3. Learning and Training Algorithms

QNBM parameters are typically trained to minimize the Kullback-Leibler (KL) divergence between the target and model distributions: $\mathcal{L}(\{w,b\}) = \sum_x P_{\mathrm{target}}(x)\,\ln\left(\frac{P_{\mathrm{target}}(x)}{\max(P_{\mathrm{model}}(x),\epsilon)}\right)$, where $\epsilon$ is a small cutoff (e.g., $10^{-16}$) that keeps the logarithm finite when the model assigns zero probability to an outcome. Because the RUS structure introduces mid-circuit measurements and classical control, analytic parameter-shift rules for the full circuit are not always available; gradients are therefore approximated using finite differences and back-propagation through the classical-quantum pipeline (Gili et al., 2022, Siddiqui et al., 2024). Adam and other standard stochastic gradient algorithms are used, typically with settings $\alpha=0.2$ and $\epsilon=0.1$, where this $\epsilon$ is a separate training hyperparameter and not the KL cutoff above.
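The clipped KL loss and a finite-difference gradient can be sketched in a few lines. The example below is illustrative only: `toy_model` is a hypothetical stand-in for $P_{\mathrm{model}}$ (a product of independent single-qubit $R_Y$ rotations, with no RUS block), the central-difference step size is an assumption, and plain gradient descent replaces Adam for brevity.

```python
import numpy as np

def kl_loss(p_target, p_model, eps=1e-16):
    """KL(target || model) with the model probabilities clipped at eps."""
    p_target = np.asarray(p_target, dtype=float)
    p_model = np.maximum(np.asarray(p_model, dtype=float), eps)
    mask = p_target > 0  # outcomes with zero target probability contribute 0
    return float(np.sum(p_target[mask] * np.log(p_target[mask] / p_model[mask])))

def finite_difference_grad(loss_fn, params, step=1e-4):
    """Central finite-difference estimate of the gradient of loss_fn."""
    params = np.asarray(params, dtype=float)
    grad = np.zeros_like(params)
    for i in range(params.size):
        shift = np.zeros_like(params)
        shift[i] = step
        grad[i] = (loss_fn(params + shift) - loss_fn(params - shift)) / (2 * step)
    return grad

def toy_model(params):
    """Hypothetical P_model: each qubit is R_Y(2*param)|0>, so P(1) = sin^2(param)."""
    probs = np.array([1.0])
    for p_one in np.sin(np.asarray(params, dtype=float)) ** 2:
        probs = np.kron(probs, np.array([1.0 - p_one, p_one]))
    return probs

target = toy_model([0.4, 1.1])           # a target the toy model can represent
loss_fn = lambda params: kl_loss(target, toy_model(params))

params = np.array([0.8, 0.8])
initial_loss = loss_fn(params)
for _ in range(200):                     # plain gradient descent, not Adam
    params = params - 0.2 * finite_difference_grad(loss_fn, params)
final_loss = loss_fn(params)
print(f"loss: {initial_loss:.4f} -> {final_loss:.2e}")
```

The clipping matters only when the model assigns zero probability to an outcome the target supports; in that case the loss stays finite instead of diverging.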

For variational inference in Bayesian binary neural networks (BNNs), QNBMs are leveraged as implicit variational posteriors over binary weights. The variational objective for the Evidence Lower Bound (ELBO) takes the form: $\mathrm{ELBO}(\phi) = \mathbb{E}_{w \sim q_{\phi}}[\log p(\mathcal{D}|w)] - \mathrm{KL}(q_{\phi}(w)\|p(w)),$ with the quantum circuit implementing $q_{\phi}(w)$ by sampling output bit-strings. Monte Carlo estimates, density-ratio estimation (via a binary classifier), and the parameter-shift gradient rule for PQCs are used to enable differentiable variational inference (Nikoloska et al., 2022).
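For a handful of binary weights the ELBO can be computed exactly by enumeration, which makes its structure easy to check. The sketch below assumes two binary weights, a uniform prior, and a made-up log-likelihood table; in the setting described above only samples from $q_{\phi}$ are available, so the KL term would instead be estimated, for example by density-ratio estimation.

```python
import itertools
import numpy as np

# All configurations of two binary weights: (0,0), (0,1), (1,0), (1,1).
weights = list(itertools.product([0, 1], repeat=2))
prior = np.full(len(weights), 1.0 / len(weights))
# Hypothetical values of log p(D | w); a real model would compute these from data.
log_lik = np.array([-3.0, -1.0, -2.5, -0.5])

def elbo(q_probs, log_lik, prior):
    """ELBO = E_q[log p(D|w)] - KL(q || prior), by exact enumeration."""
    q = np.asarray(q_probs, dtype=float)
    mask = q > 0
    expected_log_lik = np.sum(q[mask] * log_lik[mask])
    kl_to_prior = np.sum(q[mask] * np.log(q[mask] / prior[mask]))
    return float(expected_log_lik - kl_to_prior)

# The exact posterior maximizes the ELBO, where it equals the log evidence.
unnormalized = prior * np.exp(log_lik)
log_evidence = float(np.log(unnormalized.sum()))
posterior = unnormalized / unnormalized.sum()

print("ELBO at prior:    ", elbo(prior, log_lik, prior))
print("ELBO at posterior:", elbo(posterior, log_lik, prior))
print("log evidence:     ", log_evidence)
```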

4. Applications: Meta-Learning and Generative Modeling

As generative models, QNBMs are tested on tasks such as learning uniform or cardinality-constrained discrete distributions and as variational distributions for Bayesian BNNs. Notably, QNBMs systematically outperform standard QCBMs in precision and KL divergence on constrained distribution learning tasks, achieving nearly threefold reduction in error for challenging target distributions with similar gate and parameter counts (Gili et al., 2022).
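A cardinality-constrained target is simple to construct explicitly. The sketch below assumes the constraint is "exactly $k$ ones" and takes precision to mean the fraction of generated samples that satisfy the constraint; both are illustrative readings and not definitions quoted from the cited work.

```python
import itertools
import numpy as np

def cardinality_target(n_bits, k):
    """Uniform distribution over n-bit strings containing exactly k ones."""
    strings = ["".join(bits) for bits in itertools.product("01", repeat=n_bits)]
    valid = np.array([s.count("1") == k for s in strings], dtype=float)
    return strings, valid / valid.sum()

def precision(samples, k):
    """Fraction of sampled bit-strings that satisfy the cardinality constraint."""
    return sum(s.count("1") == k for s in samples) / len(samples)

strings, target = cardinality_target(4, 2)
support = [s for s, p in zip(strings, target) if p > 0]
print("support:", support)
print("precision:", precision(["0011", "0101", "1110", "1100"], k=2))
```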

In the context of Bayesian meta-learning, QNBMs define per-task variational posteriors over binary network weights, embedded within a bi-level optimization loop for fast adaptation. The algorithm consists of alternating inner-loop adaptation (task-specific parameter update via gradient descent on the negative ELBO) and outer-loop meta-optimization (hyper-gradient step for initialization parameters). This scheme, applied to synthetic regression tasks, achieves faster convergence and superior sample efficiency compared to joint or per-task learning strategies (Nikoloska et al., 2022).
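The bi-level structure can be shown on a deliberately simple classical toy problem. In this sketch each task loss is a scalar quadratic $\tfrac{1}{2}(\phi - c)^2$ standing in for the negative ELBO, the inner loop is a single gradient step, and the hyper-gradient is obtained by differentiating through that step by hand. None of this reproduces the quantum model or the experiments in the cited work.

```python
import numpy as np

def inner_adapt(theta, center, inner_lr):
    """Inner loop: one gradient step on the task loss 0.5 * (phi - center)**2."""
    return theta - inner_lr * (theta - center)

def meta_train(task_centers, inner_lr=0.3, outer_lr=0.1, meta_iters=500):
    """Outer loop: update the shared initialization using the hyper-gradient."""
    theta = 0.0  # shared initialization (the meta-parameter)
    for _ in range(meta_iters):
        hyper_grad = 0.0
        for center in task_centers:
            phi = inner_adapt(theta, center, inner_lr)
            # Chain rule through the inner step: d(phi)/d(theta) = 1 - inner_lr.
            hyper_grad += (phi - center) * (1.0 - inner_lr)
        theta -= outer_lr * hyper_grad / len(task_centers)
    return theta

centers = np.array([-1.0, 0.5, 2.0])
theta_star = meta_train(centers)
print("meta-learned initialization:", theta_star, " task mean:", centers.mean())
```

For this toy family the meta-learned initialization converges to the mean of the task optima, the point from which one inner step gives the lowest average post-adaptation loss.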

5. Circuit Architecture and Resource Scaling

A QNBM implementing an $L$-layer network with $(N_{\mathrm{in}}, N_{\mathrm{hid}}, N_{\mathrm{out}})$ structure requires $N_{\mathrm{in}} + N_{\mathrm{hid}} + N_{\mathrm{out}} + 1$ qubits, with a single ancilla reused across layers. Each neuron in layers beyond the input hosts an RUS subroutine comprising $O(n)$ controlled gates (with $n$ the number of input qubits feeding that neuron), one parameterized $R_Y$ rotation, and potential repeats. Gate count and depth scale linearly with both the number of output neurons and the maximum number of RUS trials. For large networks or BNNs with many binary weights, subsets of weights may be grouped or hybridized with classical distributions to mitigate quantum resource limitations (Nikoloska et al., 2022, Gili et al., 2022).
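The qubit count follows directly from the layer sizes. The sketch below also counts weighted connections as a rough proxy for the $O(n)$ controlled gates per neuron; that proxy assumes fully connected layers and skips an empty hidden layer, neither of which is stated in the source, and both function names are illustrative.

```python
def qnbm_qubits(n_in, n_hid, n_out):
    """Qubits for an (n_in, n_hid, n_out) QNBM: all neurons plus one reused ancilla."""
    return n_in + n_hid + n_out + 1

def weighted_connections(n_in, n_hid, n_out):
    """Input-to-neuron connections, assuming fully connected layers.

    Each connection needs at least one controlled rotation per RUS trial,
    so this is a proxy for the gate count, not an exact figure.
    """
    if n_hid == 0:  # no hidden layer: inputs feed the outputs directly
        return n_in * n_out
    return n_in * n_hid + n_hid * n_out

for shape in [(1, 0, 2), (2, 0, 3), (3, 0, 4)]:
    print(shape, "qubits:", qnbm_qubits(*shape),
          "connections:", weighted_connections(*shape))
```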

Hardware imperfections such as gate infidelities, single- and two-qubit errors, and mid-circuit measurement errors constrain the practically achievable circuit size. For instance, hardware implementations on the Quantinuum H1-1 device encounter bottlenecks at $N_{\mathrm{out}}\approx4$ due to classical register limits and software-compiler constraints before quantum gate errors become prohibitive (Siddiqui et al., 2024).

6. Experimental Performance and Hardware Evaluation

Numerical and hardware experiments validate QNBMs on several axes:

Distribution learning: On uniform distributions, QNBMs reach a higher KL divergence than QCBMs ($\approx0.0157$ vs. $0.0058$), but on cardinality-constrained targets they outperform QCBMs, with a lower error rate and higher precision ($P\approx0.95$). Neither a linearized QNBM nor a deeper QCBM closes this gap, which supports attributing the advantage to mid-circuit non-linearity (Gili et al., 2022).
Bayesian inference: QNBM-based meta-learning for BNNs reaches the ideal root mean square error (RMSE) in about 150 meta-iterations on synthetic regression, compared to more than 200 for non-meta-learning approaches (Nikoloska et al., 2022).
Hardware stress testing: On the Quantinuum H1-1 platform, QNBMs are used to empirically probe hardware limits via incremental scaling of output neurons (and hence RUS subroutines), measuring breakdown in shot count, KL convergence, and resource consumption. The main limiting factors are not quantum noise but classical control infrastructure—register allocation, looping, and QDK (Quantum Development Kit) interoperability (Siddiqui et al., 2024).

Model size	Classical-control shots	Post-selection shots	KL
(1,0,2)	400	1,600	0.053
(2,0,3)	800	6,400	~0.2
(3,0,4)	1,600	25,600	~0.4

KL increases with output layer size (roughly linearly over the three model sizes tested), but small-scale QNBMs reliably capture target distributions in hardware settings.
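The shot counts in the table can be checked with a few lines of arithmetic. The snippet below only restates the tabulated values; the observation that the post-selection overhead equals $2^{N_{\mathrm{out}}}$ for these three rows is read off the numbers and is not a claim taken from the cited papers.

```python
# Rows copied from the table: (model size, classical-control shots,
# post-selection shots, approximate KL).
rows = [
    ((1, 0, 2), 400, 1_600, 0.053),
    ((2, 0, 3), 800, 6_400, 0.2),
    ((3, 0, 4), 1_600, 25_600, 0.4),
]

overheads = []
for model, classical_shots, post_selection_shots, kl in rows:
    n_out = model[2]
    overhead = post_selection_shots / classical_shots
    overheads.append(overhead)
    print(f"{model}: overhead = {overhead:.0f}, 2**N_out = {2 ** n_out}, KL ~ {kl}")
```

Classical-control shots double with each added output neuron while post-selection shots quadruple, so the gap between the two execution modes widens as the output layer grows.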

7. Broader Significance and Outlook

QNBMs exploit mid-circuit measurements and classical feedback to realize circuit-level non-linearity, an attribute absent in standard PQCs and QCBMs. This non-linearity enhances generative expressivity and may improve training dynamics, offering a way around the limits that linear quantum mechanics places on function composition and potentially mitigating issues such as barren plateaus. A plausible implication is that RUS-based architectures may offer a scalable framework as hardware support for mid-circuit measurement and classical-quantum control integration matures.

Future directions cited in the literature include development of analytic gradient rules for RUS-style circuits, extensibility to larger and structured tasks, refined compiler-classical control co-design, and further exploration of non-linear activations and deeper neuron-like motifs. Systematic benchmarks and hardware implementations will continue to inform both algorithmic and device-level design (Nikoloska et al., 2022, Gili et al., 2022, Siddiqui et al., 2024).