DARUANs: Quantum Data Re-Uploading Activations
- DARUANs are quantum variational activation functions that use repeated single-qubit data re-uploading to achieve universal function approximation with exponential parameter efficiency.
- They exponentially amplify frequency spectra via weighted data encoding, which drastically reduces the number of trainable parameters compared to classical Fourier methods.
- DARUANs are integrated into neural architectures like Kolmogorov–Arnold Networks, enhancing regression and classification tasks while being optimized for NISQ devices.
DatA Re-Uploading ActivatioNs (DARUANs) are quantum variational activation functions built upon single-qubit data re-uploading circuits. They provide a scalable, expressive, and parameter-efficient framework for quantum (and quantum-inspired) deep learning, most notably as nonlinearities inside Kolmogorov–Arnold Networks (KANs) and other modern neural architectures. DARUANs harness the exponentially growing spectral capacity of repeated and weighted data encoding in quantum circuits, thus realising universal approximation properties, exponential compression relative to classical Fourier-based activations, and improved learning dynamics for both regression and classification tasks. Their design is directly motivated by constraints and opportunities in noisy intermediate-scale quantum (NISQ) devices and classical quantum circuit simulators.
1. Quantum Variational Activation Functions via Data Re-Uploading
The foundational insight of DARUANs is the use of single-qubit quantum circuits in which classical input data are repeatedly "re-uploaded," interleaved with trainable variational unitaries. A DARUAN computes an activation function as the expectation value of an observable after a parametrized data-processing circuit:

$$f(x; \boldsymbol{\theta}, \mathbf{w}) = \langle 0 |\, U^\dagger(x, \boldsymbol{\theta}, \mathbf{w})\, M\, U(x, \boldsymbol{\theta}, \mathbf{w})\, | 0 \rangle,$$

where $U(x, \boldsymbol{\theta}, \mathbf{w})$ is an SU(2) variational unitary dependent on both $x$ and trainable weights $(\boldsymbol{\theta}, \mathbf{w})$, and $M$ is typically a Pauli operator (e.g., $\sigma_z$). The circuit is constructed by alternating layers that encode $x$ (via single-qubit rotations, often with trainable weights in the angle) and trainable, data-independent single-qubit gates:

$$U(x, \boldsymbol{\theta}, \mathbf{w}) = W^{(L+1)}(\theta_{L+1}) \prod_{l=1}^{L} S(w_l x)\, W^{(l)}(\theta_l),$$

where $S(w_l x) = e^{-i w_l x H}$ is a weighted data encoding with generator $H$ (typically a Pauli matrix), $w_l$ are trainable weights, and $W^{(l)}$ are generic single-qubit unitaries. The number of repetitions $L$ sets the nonlinearity and frequency richness of the activation.
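The forward pass above can be simulated classically in a few lines. The following is a minimal sketch; the concrete gate choices ($R_x$ data encoding, $R_z R_y R_z$ variational blocks, $\langle Z \rangle$ readout) are illustrative assumptions, not the paper's exact ansatz.

```python
import numpy as np

def rx(a):
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def ry(a):
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(a):
    return np.diag([np.exp(-0.5j * a), np.exp(0.5j * a)])

def su2(a, b, c):
    """Generic single-qubit variational block W(theta)."""
    return rz(c) @ ry(b) @ rz(a)

def daruan(x, thetas, weights):
    """<Z> after L weighted re-uploading layers; thetas has shape (L + 1, 3)."""
    psi = np.array([1.0, 0.0], dtype=complex)        # start in |0>
    for (a, b, c), w in zip(thetas[:-1], weights):
        psi = rx(w * x) @ su2(a, b, c) @ psi         # W(theta_l), then S(w_l x)
    psi = su2(*thetas[-1]) @ psi                     # final block W(theta_{L+1})
    return float(abs(psi[0]) ** 2 - abs(psi[1]) ** 2)

rng = np.random.default_rng(0)
L = 3
thetas = rng.uniform(-np.pi, np.pi, size=(L + 1, 3))
weights = 2.0 ** np.arange(L)                        # geometric weights 1, 2, 4
print(daruan(0.7, thetas, weights))                  # a value in [-1, 1]
```

Because the state stays on a single qubit, each evaluation costs only a handful of 2×2 matrix products regardless of how expressive the resulting activation is.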
The critical aspect distinguishing DARUANs from classical activation function constructions (such as B-splines or truncated Fourier sums) is this quantum-native data re-uploading mechanism. The output of a DARUAN can be interpreted as a highly parameterized trigonometric polynomial whose frequency components are exponentially enhanced by the layered, weighted encoding structure (Jiang et al., 17 Sep 2025).
2. Exponential Frequency Spectra and Universal Approximation
Data re-uploading with trainable weights in the encoding enables an exponential increase in the number of accessible frequency components. For $L$ re-upload blocks with independent weights $w_1, \ldots, w_L$, the set of available frequencies is

$$\Omega_L = \Big\{ \sum_{l=1}^{L} c_l\, w_l \;:\; c_l \in \{-1, 0, 1\} \Big\},$$

which grants maximum frequency $\omega_{\max} = \sum_{l} w_l = 2^L - 1$ with the optimal geometric weight choice ($w_l = 2^{l-1}$), yielding a trigonometric basis of $2^L - 1$ nontrivial frequencies.
This expands the spectral capacity of the activation function exponentially with $L$, in contrast to the linear scaling without weighting, and allows for exponentially improved parameter efficiency. To approximate a target function to error $\epsilon$, classical Fourier-based activations require a parameter count that grows linearly in the number of retained frequencies $K(\epsilon)$, whereas DARUANs require only $\mathcal{O}(\log K(\epsilon))$ parameters. Thus, DARUANs reach a given approximation error with exponentially fewer trainable weights, a significant advantage when embedding nonlinearities within large-scale quantum or quantum-inspired neural architectures (Jiang et al., 17 Sep 2025).
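The frequency set $\Omega_L$ can be enumerated directly to confirm the exponential-versus-linear scaling. The sketch below assumes each encoding layer contributes eigenvalue shifts $\pm w_l / 2$, so the accessible frequencies are the differences $\sum_l c_l w_l$ with $c_l \in \{-1, 0, 1\}$:

```python
from itertools import product

def spectrum(weights):
    """All nonnegative frequencies sum_l c_l * w_l, c_l in {-1, 0, 1}
    (differences of the +/- w_l / 2 eigenvalue shifts of each encoding)."""
    return sorted({abs(sum(c * w for c, w in zip(cs, weights)))
                   for cs in product((-1, 0, 1), repeat=len(weights))})

L = 4
geometric = spectrum([2.0 ** l for l in range(L)])   # weights 1, 2, 4, 8
uniform = spectrum([1.0] * L)                        # unweighted baseline

print(max(geometric))       # 15.0 = 2**L - 1: exponential in depth
print(max(uniform))         # 4.0 = L: linear in depth
print(len(geometric) - 1)   # 15 nontrivial frequencies, no gaps
```

Signed binary digits $\{-1, 0, 1\}$ with geometric weights reach every integer frequency up to $2^L - 1$, while uniform weights only reach $L$; this is exactly the gap the weighted encoding exploits.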
Furthermore, repeated data re-uploading yields a universal approximator even with a single qubit; the expectation value of such a circuit can approximate any real-valued function on compact domains to arbitrary precision, subject to the expressivity permitted by the circuit depth and parameterization (Pérez-Salinas et al., 2019).
3. Integration into Kolmogorov–Arnold Networks and Neural Architectures
When inserted as nonlinear activation modules in Kolmogorov–Arnold Networks (KANs), each neural edge becomes a learnable quantum variational activation (the DARUAN). The canonical KAN layer equation is

$$x_{l+1,j} = \sum_{i=1}^{n_l} \phi_{l,j,i}(x_{l,i}),$$

and, with DARUAN activations, each edge function is realized by a single-qubit re-uploading circuit:

$$\phi_{l,j,i}(x) = \langle 0 |\, U^\dagger_{l,j,i}(x)\, M\, U_{l,j,i}(x)\, | 0 \rangle.$$
This QKAN (quantum-inspired KAN) architecture unifies the interpretability of KAN models with the parameter efficiency and enhanced expressivity of quantum variational activation functions. When applied as a direct replacement for classical MLPs or as feed-forward layers in convolutional or transformer networks, QKAN modules maintain generalization performance while drastically reducing memory and parameter requirements. Empirical results demonstrate state-of-the-art accuracy across noise-robust regression, image classification (MNIST, CIFAR-10/100), and even autoregressive generative language modeling tasks, all while using significantly fewer trainable parameters than MLP or Fourier-KAN baselines (Jiang et al., 17 Sep 2025).
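A single QKAN layer is then just the KAN sum rule with one independent DARUAN per edge. The sketch below makes the usual illustrative assumptions ($R_x$ encoding, $R_z R_y R_z$ blocks, encoding weights shared across edges for brevity) and omits the final variational block for compactness:

```python
import numpy as np

def rx(a):
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def ry(a):
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(a):
    return np.diag([np.exp(-0.5j * a), np.exp(0.5j * a)])

def daruan(x, thetas, weights):
    """<Z> of a single-qubit re-uploading circuit; thetas: (L, 3)."""
    psi = np.array([1.0, 0.0], dtype=complex)
    for (a, b, c), w in zip(thetas, weights):
        psi = rx(w * x) @ rz(c) @ ry(b) @ rz(a) @ psi
    return float(abs(psi[0]) ** 2 - abs(psi[1]) ** 2)

def qkan_layer(x, thetas, weights):
    """x: (n_in,), thetas: (n_out, n_in, L, 3): one DARUAN per edge (i, j)."""
    n_out, n_in = thetas.shape[:2]
    return np.array([sum(daruan(x[i], thetas[j, i], weights)
                         for i in range(n_in)) for j in range(n_out)])

rng = np.random.default_rng(1)
n_in, n_out, L = 3, 2, 2
thetas = rng.uniform(-np.pi, np.pi, size=(n_out, n_in, L, 3))
x = rng.uniform(-1.0, 1.0, size=n_in)
y = qkan_layer(x, thetas, 2.0 ** np.arange(L))
print(y)   # two outputs, each bounded by n_in in magnitude
```

Each edge function is bounded in $[-1, 1]$, so the layer output is automatically bounded by the fan-in, which is one reason the resulting models train stably without extra normalization.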
Two techniques introduced in the QKAN framework further optimize scalability:
- Layer Extension: Incrementally increases the DARUAN circuit depth during or after training, reusing earlier weights to avoid disrupting the learned representation.
- Hybrid QKANs (HQKANs): Interleave autoencoder-style bottlenecks with DARUAN-based layers, mitigating quadratic parameter growth for high-dimensional inputs and outputs.
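Layer extension can be sketched as an identity-preserving warm start: new re-uploading blocks are appended with zero variational angles and zero encoding weight, so the extended circuit initially computes exactly the learned function. The identity-initialization scheme here is an assumption for illustration, not the paper's exact protocol:

```python
import numpy as np

def rx(a):
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def ry(a):
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(a):
    return np.diag([np.exp(-0.5j * a), np.exp(0.5j * a)])

def daruan(x, thetas, weights):
    psi = np.array([1.0, 0.0], dtype=complex)
    for (a, b, c), w in zip(thetas, weights):
        psi = rx(w * x) @ rz(c) @ ry(b) @ rz(a) @ psi
    return float(abs(psi[0]) ** 2 - abs(psi[1]) ** 2)

def extend(thetas, weights, extra):
    """Grow an L-layer DARUAN to L + extra layers, reusing trained parameters."""
    return (np.vstack([thetas, np.zeros((extra, 3))]),      # identity blocks
            np.concatenate([weights, np.zeros(extra)]))     # inert encodings

rng = np.random.default_rng(2)
thetas = rng.uniform(-np.pi, np.pi, size=(2, 3))
weights = np.array([1.0, 2.0])
t2, w2 = extend(thetas, weights, 2)
for x in np.linspace(-1.0, 1.0, 5):
    assert abs(daruan(x, thetas, weights) - daruan(x, t2, w2)) < 1e-12
print("extended circuit matches the original before fine-tuning")
```

Since zero-angle rotations are exact identities, the extra capacity is added without perturbing the learned representation; subsequent training can then move the new parameters away from zero.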
4. Expressivity, Regularization, and Trainability
The frequency profile of the learned function in a deep re-uploading model forms a Gaussian envelope, arising from the convolution of the spectral distributions contributed by each layer. As the number $L$ of re-uploaded layers increases, the total support in frequency space grows linearly in $L$, but only $\mathcal{O}(\sqrt{L})$ components are significant (i.e., have non-negligible amplitudes) (Barthe et al., 2023). Consequently, derivatives of the learned function are upper-bounded (the average Lipschitz constant scales as $\mathcal{O}(\sqrt{L})$), which regularizes the hypothesis class and suppresses overfitting to high-frequency noise.
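The linear growth of the frequency support is easy to verify numerically: an unweighted ($w_l = 1$) $L$-layer model is a trigonometric polynomial of degree at most $L$, so its Fourier coefficients must vanish beyond frequency $L$. Gate choices below are the same illustrative assumptions as before; the Gaussian amplitude envelope itself is the cited analytical result, not something this check proves.

```python
import numpy as np

def rx(a):
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def ry(a):
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(a):
    return np.diag([np.exp(-0.5j * a), np.exp(0.5j * a)])

def daruan(x, thetas):
    psi = np.array([1.0, 0.0], dtype=complex)
    for a, b, c in thetas:
        psi = rx(x) @ rz(c) @ ry(b) @ rz(a) @ psi   # unit-weight encoding
    return float(abs(psi[0]) ** 2 - abs(psi[1]) ** 2)

L, N = 6, 64
rng = np.random.default_rng(3)
thetas = rng.uniform(-np.pi, np.pi, size=(L, 3))
xs = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
coeffs = np.fft.fft([daruan(x, thetas) for x in xs]) / N
freqs = np.fft.fftfreq(N, d=1.0 / N)                # integer frequency labels
leakage = np.abs(coeffs[np.abs(freqs) > L]).max()
print(leakage)   # ~0: no component beyond frequency L survives
```

Sampling $N = 64 > 2L + 1$ points makes the discrete Fourier transform exact for a degree-$L$ trigonometric polynomial, so any coefficient beyond frequency $L$ is pure floating-point noise.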
Trainability is assessed via gradient statistics. While large, deep circuits in generic PQCs suffer from the barren plateau phenomenon, DARUANs avoid this in part by their architectural structure and by keeping the so-called “absorption witness” [Editor's term, (Barthe et al., 2023)]—a measure of how data encoding perturbs the gradient variance—small. This ensures that the magnitude and variance of gradients remain substantial throughout training, a property empirically observed even as the number of qubits increases or in reinforcement learning settings with non-stationary targets (Coelho et al., 21 Jan 2024).
5. Physical Realizations and Quantum-Classical Applications
DARUANs and their data re-uploading foundations have been experimentally realized on diverse platforms, including integrated photonic processors (Mauser et al., 7 Jul 2025), superconducting transmon quantum simulators (Tolstobrov et al., 2023), and silicon optical circuits (Ono et al., 2022). These implementations demonstrate:
- Resource efficiency: Universal classification with a single physical qubit or mode due to the expressivity of repeated data encoding.
- Scalability: Easily mapped onto NISQ-hardware via single-qubit (or mode) circuits without the need for deep entanglement.
- Robustness: Flat loss landscapes and finite VC dimension when separating the data encoding and processing gates, conferring improved trainability and necessary limitations on model complexity for generalization.
DARUAN-inspired architectures are further deployed in hybrid models for time series forecasting, reinforcement learning, and particle physics applications (Schetakis et al., 22 Jan 2025, Cassé et al., 16 Dec 2024, Wang et al., 24 May 2025), often achieving competitive accuracy and superior parameter/data efficiency by leveraging incremental, cyclic, or strategic re-uploading schemes.
6. Design Trade-Offs and Limitations
Recent investigations reveal a fundamental limitation in overly deep data re-uploading architectures: the signal distinguishing the quantum states degrades exponentially with encoding depth when acting on limited qubit registers and high-dimensional data—formally, the quantum state approaches the maximally mixed state and predictive power collapses to random guessing (Wang et al., 24 May 2025). This effect is characterized by a tight bound on the Petz–Rényi-2 divergence between the state and the maximally mixed state, and is insensitive to repetition of re-uploading layers.
Therefore, for practical DARUAN deployments, “wider” circuits (increased qubit counts distributing data across fewer encoding layers) must be favored over deeper, narrow circuits. This architectural choice preserves signal and predictive power on unseen data, especially in high-dimensional settings.
7. Future Directions and Outlook
DARUANs, as quantum variational activation functions, serve as a robust substrate for highly expressive, interpretable, and efficient AI models directly compatible with both quantum hardware and classical quantum simulators. Potential avenues include:
- Deeper integration within hybrid and classical neural architectures by leveraging exponential frequency amplification and refined parameterization strategies.
- Adaptive training and layer extension protocols to balance expressivity, generalization, and resource constraints.
- Exploration of alternative data encoding schemes, operator choices, and circuit ansätze to further optimize the tradeoff between regularization and expressivity based on application requirements.
- Systematic investigation of statistical learning properties, VC dimension, and loss landscapes in photonic implementations to facilitate resource-efficient and robust QML.
These directions indicate that DARUANs and related architectures may play a central role in the convergence of quantum and classical machine learning, both as a practical computational tool and as a theoretical bridge between quantum circuits and deep neural network design.