ClusterVQE: Hybrid Quantum Clustering

Updated 5 February 2026

ClusterVQE is a hybrid quantum algorithm that integrates variational quantum eigensolvers with clustering techniques to optimize unsupervised learning and molecular simulations.
It employs both data-driven clustering methods and qubit-partitioned circuits to reduce quantum resource requirements while maintaining high accuracy.
Numerical benchmarks on datasets like Iris and molecular simulations of LiH demonstrate ClusterVQE’s efficiency, scalability, and precision on resource-limited devices.

ClusterVQE refers to a family of quantum algorithms and quantum-inspired hybrid schemes leveraging the variational quantum eigensolver (VQE) in combination with data or qubit clustering to efficiently solve unsupervised learning and quantum chemistry problems on resource-limited hardware. These approaches employ iterative minimization of a cost Hamiltonian reflecting the target partitioning or molecular ground state, while exploiting cluster-based decompositions—of datapoints, qubits, or operator structure—to reduce circuit or device requirements. Supported by a range of implementations, ClusterVQE encompasses both data-driven clustering (e.g., k-means, MaxCut) and physics-motivated qubit-clustering for electronic structure (Bermejo et al., 2022, Zhang et al., 2021, Ghasempouri et al., 2023, Yung et al., 2023, Yogendran et al., 2024, Beaulieu et al., 2021, Romero et al., 2017).

1. Mathematical Formalism and Problem Mappings

ClusterVQE encodes a global optimization problem (such as clustering $N$ datapoints into $k$ classes, or finding the ground state of an $N$-qubit molecular Hamiltonian) as a quantum or quantum-inspired cost function amenable to variational simulation.

Data Clustering via Hamiltonians

Given feature vectors $\vec{x}_i \in \mathbb{R}^m$ and $k$ clusters, the classical assignment cost is typically expressed as

$H_{\rm classical} = \frac{1}{2} \sum_{i,j=1}^N d(\vec{x}_i, \vec{x}_j) \sum_{a=1}^k q_i^a q_j^a$

with $\sum_{a=1}^k q_i^a = 1$ enforcing the one-hot constraint. Mapped to a quantum setting, this becomes

$H_c = \sum_{i<j} \sum_{a=1}^k d(\vec{x}_i, \vec{x}_j) \ |\psi^a\rangle\langle\psi^a|_i \otimes |\psi^a\rangle\langle\psi^a|_j$

where $\{|\psi^a\rangle\}$ are maximally orthogonal label reference states in the $2^n$-dimensional Hilbert space of the $n$ qubits that encode each datapoint (Bermejo et al., 2022).
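As a minimal sketch of the classical cost $H_{\rm classical}$ above, the following Python snippet evaluates it for hard (one-hot) assignments. The toy points, the Euclidean choice for $d$, and the helper name `classical_cost` are illustrative assumptions, not taken from the cited papers.

```python
import math

def classical_cost(points, labels, k):
    """H_classical = 1/2 * sum_{i,j} d(x_i, x_j) * sum_a q_i^a q_j^a for one-hot q."""
    n = len(points)
    # One-hot encoding: q[i][a] = 1 iff point i is assigned to cluster a.
    q = [[1 if labels[i] == a else 0 for a in range(k)] for i in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            same_cluster = sum(q[i][a] * q[j][a] for a in range(k))
            total += math.dist(points[i], points[j]) * same_cluster
    return 0.5 * total

# Toy data (assumed): two well-separated pairs of points in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = classical_cost(points, [0, 0, 1, 1], k=2)  # groups nearby points
bad = classical_cost(points, [0, 1, 0, 1], k=2)   # groups distant points
print(good, bad)
```

Only pairs that share a cluster contribute, so the assignment that groups nearby points has the lower cost. The factor 1/2 compensates for counting each pair twice in the double sum, which is why the quantum version sums over $i<j$ instead.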

For binary clustering (MaxCut), the assignment reduces to an Ising Hamiltonian,

$H_{\mathrm{cut}} = \sum_{i<j} w_{ij} \frac{1-Z_i Z_j}{2}$

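with $w_{ij}$ a weighted adjacency or distance matrix and $Z_i$ the Pauli $Z$ operator on qubit $i$ (Beaulieu et al., 2021, Yung et al., 2023, Yogendran et al., 2024). Because $H_{\mathrm{cut}}$ measures the total weight of the cut, a good two-cluster partition maximizes it, so in a VQE setting one minimizes $-H_{\mathrm{cut}}$.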
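To make the MaxCut mapping concrete, the sketch below evaluates $H_{\mathrm{cut}}$ on computational basis states (where $Z_i$ takes the value $1-2b_i$ for bit $b_i$) and finds the best bipartition by brute force. The four 1-D data points and the use of absolute differences as $w_{ij}$ are assumptions chosen for illustration.

```python
from itertools import product

def cut_value(bits, w):
    """Eigenvalue of H_cut on a basis state, using Z_i -> 1 - 2 * bits[i]."""
    n = len(bits)
    z = [1 - 2 * b for b in bits]
    return sum(w[i][j] * (1 - z[i] * z[j]) / 2
               for i in range(n) for j in range(i + 1, n))

# Toy data (assumed): two tight groups on a line, weights = pairwise distances.
xs = [0.0, 1.0, 10.0, 11.0]
w = [[abs(a - b) for b in xs] for a in xs]

# Exhaustive search over all 2^n bitstrings; only feasible for tiny n.
best_bits = max(product([0, 1], repeat=len(xs)), key=lambda b: cut_value(b, w))
best_cut = cut_value(best_bits, w)
print(best_bits, best_cut)
```

The maximizing bitstring separates the two groups, and the bit-flipped string gives the same cut value, reflecting the global $Z_2$ symmetry of $H_{\mathrm{cut}}$.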

Quantum Chemistry with Clustered Qubits

In molecular VQE, the unitary coupled cluster (UCC) ansatz or one of its variants is employed. ClusterVQE partitions the $N$-qubit system into $M$ clusters ${\mathcal C}_p$ based on the quantum mutual information $I_{ij}$, chosen to maximize intra-cluster entanglement and minimize inter-cluster coupling (Zhang et al., 2021). The wavefunction is represented as a tensor product of cluster states, and the full Hamiltonian is "dressed" to capture inter-cluster correlations:

$H_d = \Bigl(\prod_{p\neq q} U_{pq}^\dagger\Bigr) H \Bigl(\prod_{p\neq q} U_{pq}\Bigr)$

with $U_{pq}$ the unitaries connecting clusters $p$ and $q$.
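The mutual-information criterion can be illustrated with a small sketch that picks the equal-size bipartition with the least inter-cluster mutual information. The matrix values below are invented for illustration, and the brute-force search is a stand-in for whatever graph-partitioning routine an actual implementation uses; neither is taken from Zhang et al. (2021).

```python
from itertools import combinations

def best_bipartition(mi):
    """Brute-force the equal-size bipartition minimizing inter-cluster mutual information."""
    n = len(mi)
    best, best_cross = None, float("inf")
    for left in combinations(range(n), n // 2):
        if 0 not in left:  # pin qubit 0 to the left cluster to skip mirror-image duplicates
            continue
        right = tuple(q for q in range(n) if q not in left)
        cross = sum(mi[i][j] for i in left for j in right)
        if cross < best_cross:
            best, best_cross = (left, right), cross
    return best, best_cross

# Hypothetical symmetric mutual-information matrix I_ij for 4 qubits.
mi = [[0.00, 0.90, 0.10, 0.05],
      [0.90, 0.00, 0.05, 0.10],
      [0.10, 0.05, 0.00, 0.80],
      [0.05, 0.10, 0.80, 0.00]]
clusters, cross_mi = best_bipartition(mi)
print(clusters, cross_mi)
```

Strongly correlated qubits (0 with 1, and 2 with 3) end up in the same cluster, leaving only weak correlations to be handled by the dressing step. The exhaustive search scales combinatorially, so it is only meant for small examples.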

2. Quantum Circuit Architectures and Computational Strategies

Data Mapping and Label-State Encoding

Data vectors are mapped to quantum states via feature-map circuits, such as sequences of $R_z$ and $R_y$ rotations that distribute the features over $2n$ rotation angles for $n$ qubits. Labels are not restricted to computational basis vectors; instead, one uses maximally mutually orthogonal states, e.g., the four vertices of a tetrahedron inscribed in the Bloch sphere for up to $k=4$ clusters with $n=1$ qubit (Bermejo et al., 2022).
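The tetrahedral label states can be written down explicitly. The sketch below builds four single-qubit states whose Bloch vectors form a regular tetrahedron, checks that every pair has fidelity $1/3$ (the smallest common overlap four qubit states can share), and assigns a state to the label with the highest fidelity. The particular orientation of the tetrahedron and the helper names are assumptions for illustration.

```python
import cmath
import math

def bloch_state(theta, phi):
    """Single-qubit state cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    return (complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2))

def fidelity(a, b):
    """|<a|b>|^2 for two normalized single-qubit states."""
    overlap = a[0].conjugate() * b[0] + a[1].conjugate() * b[1]
    return abs(overlap) ** 2

# One vertex at the north pole, three at polar angle arccos(-1/3), 120 degrees apart.
theta_t = math.acos(-1 / 3)
label_states = [bloch_state(0.0, 0.0)] + [
    bloch_state(theta_t, 2 * math.pi * m / 3) for m in range(3)
]

def assign(state):
    """Index of the label state with maximal fidelity to `state`."""
    fids = [fidelity(psi, state) for psi in label_states]
    return max(range(len(fids)), key=fids.__getitem__)

pair_fids = [fidelity(label_states[a], label_states[b])
             for a in range(4) for b in range(a + 1, 4)]
probe = bloch_state(theta_t + 0.2, 2 * math.pi / 3 + 0.1)  # slightly perturbed vertex 2
print(pair_fids, assign(probe))
```

Because the label states are not mutually orthogonal, distinct labels still overlap by $1/3$, so the assignment rule relies on the largest fidelity rather than on an exact projective readout.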

Clustered Circuit Schemes

ClusterVQE reduces circuit depth and device width by distributing the problem over smaller subcircuits:

In quantum chemistry, each cluster simulation employs a shallower ansatz (e.g., hardware-efficient, UCC, or modular cluster circuits), avoiding large-width entanglers and limiting inter-cluster unitaries to classical pre/post-processing (Zhang et al., 2021, Ghasempouri et al., 2023).
In data clustering, circuits prepare the parameterized trial state, with data-mapped rotations followed by a layer of CNOTs for entanglement. For coreset-based approaches, the number of qubits $m$ matches the coreset size (Yogendran et al., 2024, Yung et al., 2023).

3. Workflow: Optimization and Measurement

The variational workflow common to all implementations proceeds as:

Hamiltonian Construction: Encode the cost function as a qubit Hamiltonian (clustering cost, MaxCut, molecular or dressed Hamiltonian).
Ansatz Initialization: Construct parameterized trial circuits (e.g., hardware-efficient, modular cluster, UCCSD).
Classical Optimization: Apply optimizers such as Adam, SPSA, COBYLA, or SLSQP to minimize energy expectation $\langle H \rangle$ with respect to the variational parameters.
Energy and Gradient Measurement: Measure and average Pauli-term expectations via shot-based sampling; analytic gradient computation is enabled through dressed operators or Hadamard-test–like subcircuits (Romero et al., 2017, Zhang et al., 2021).
Cluster Assignment or State Reconstruction: For clustering, assign classes via maximal label-state fidelity or output bitstring; for chemistry, extract the optimized wavefunction and ground-state energy.

Optionally, error mitigation is applied via zero-noise extrapolation or Clifford-data regression (Bermejo et al., 2022, Zhang et al., 2021).
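The workflow steps above can be sketched end to end for a toy MaxCut clustering problem. This is a minimal illustration under strong assumptions: the ansatz is an unentangled product of $R_y$ rotations, so $\langle Z_i Z_j \rangle = \cos\theta_i \cos\theta_j$ is evaluated in closed form instead of by shot-based sampling; the optimizer is plain gradient descent with parameter-shift gradients rather than Adam, SPSA, COBYLA, or SLSQP; and the data, step size, and starting angles are arbitrary choices.

```python
import math

# Hamiltonian construction: w_ij = pairwise distances of toy 1-D data (assumed).
xs = [0.0, 1.0, 10.0, 11.0]
w = [[abs(a - b) for b in xs] for a in xs]

def energy(thetas):
    """E = -<H_cut> for prod_i Ry(theta_i)|0>, using <Z_i Z_j> = cos(theta_i) cos(theta_j)."""
    n = len(thetas)
    cut = sum(w[i][j] * (1 - math.cos(thetas[i]) * math.cos(thetas[j])) / 2
              for i in range(n) for j in range(i + 1, n))
    return -cut  # minimizing -H_cut maximizes the cut

def parameter_shift_gradient(thetas):
    """dE/dtheta_i = [E(theta_i + pi/2) - E(theta_i - pi/2)] / 2 for Ry rotations."""
    grad = []
    for i in range(len(thetas)):
        plus, minus = list(thetas), list(thetas)
        plus[i] += math.pi / 2
        minus[i] -= math.pi / 2
        grad.append((energy(plus) - energy(minus)) / 2)
    return grad

# Ansatz initialization: arbitrary asymmetric start near the equator of each Bloch sphere.
thetas = [math.pi / 2 + 0.1 * (i + 1) for i in range(len(xs))]
initial_energy = energy(thetas)

# Classical optimization: fixed-step gradient descent (step size chosen by hand).
for _ in range(200):
    grad = parameter_shift_gradient(thetas)
    thetas = [t - 0.05 * g for t, g in zip(thetas, grad)]

# Cluster assignment: read each qubit out as the more likely bit.
final_energy = energy(thetas)
bits = [1 if math.cos(t) < 0 else 0 for t in thetas]
print(bits, final_energy)
```

On this toy instance the loop separates the two groups of points. A product ansatz cannot represent entangled states, so on harder instances an entangling layer and a sampled energy estimate would be needed; the loop structure would stay the same.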

4. Quantum-Inspired and Coreset Techniques

Classical simulability and data size reduction are key to making ClusterVQE practical on current devices:

Tensor Network Simulation: In Bermejo et al. (2022), the variational circuit is chosen so that the state of each data point is a matrix product state (MPS) of bond dimension $\chi \leq 2$, enabling efficient gradient-based or imaginary-time optimization entirely classically ("quantum-inspired clustering").
Coreset Reduction: To handle large datasets, a classical $\epsilon$-coreset (a weighted data subset) is constructed, and only the $m$-point coreset is mapped to a quantum Hamiltonian (Yogendran et al., 2024, Yung et al., 2023). Quantum-tailored coresets, such as the deterministic "Contour coreset," are specifically designed to avoid missing minority clusters when $m$ is small (Yung et al., 2023).
Resource Scaling: With coreset reduction, the quantum resource load (qubits, circuit depth) scales with the coreset cardinality $m$ rather than the dataset size $N$, permitting larger classical datasets to be processed by small quantum devices.
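The resource argument can be illustrated with a deliberately naive sketch. Uniform random sampling is used here as a stand-in for a real coreset construction: it carries no $\epsilon$-coreset guarantee and is not the Contour coreset of Yung et al. (2023). The edge weighting $w_i w_j\, d(\vec{x}_i, \vec{x}_j)$, the synthetic data, and all function names are likewise assumptions.

```python
import math
import random

def uniform_coreset(points, m, seed=0):
    """Naive weighted subset: m uniformly sampled points, each standing in for N/m points."""
    rng = random.Random(seed)
    chosen = rng.sample(points, m)
    weight = len(points) / m
    return [(p, weight) for p in chosen]

def coreset_maxcut_weights(coreset):
    """Edge weights w_i * w_j * d(x_i, x_j) for every coreset pair (assumed weighting)."""
    edges = {}
    for i in range(len(coreset)):
        for j in range(i + 1, len(coreset)):
            (xi, wi), (xj, wj) = coreset[i], coreset[j]
            edges[(i, j)] = wi * wj * math.dist(xi, xj)
    return edges

# Synthetic dataset (assumed): 1000 points drawn around two centres.
rng = random.Random(1)
data = [(rng.gauss(cx, 0.5), rng.gauss(0.0, 0.5))
        for cx in (0.0, 6.0) for _ in range(500)]

coreset = uniform_coreset(data, m=6)
edges = coreset_maxcut_weights(coreset)
num_qubits = len(coreset)    # one qubit per coreset point
num_zz_terms = len(edges)    # m * (m - 1) / 2 pairwise ZZ terms
print(len(data), num_qubits, num_zz_terms)
```

The qubit count and the number of $Z_i Z_j$ terms depend only on $m$, not on the 1000 original points. The quality of the resulting clustering depends entirely on how well the coreset represents the data, which is what the tailored constructions in the cited papers address.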

5. Numerical Results and Empirical Performance

Benchmarks in Data Clustering

Clustering Accuracy: On the Iris dataset ($n=1$ qubit, $k=3$ clusters), ClusterVQE achieves 96% accuracy in 15–20 epochs using the Adam optimizer (Bermejo et al., 2022).
Coreset-Based VQE: The VQE+Contour coreset approach shows superior average accuracy and lower standard deviation compared to QAOA+generic coreset for Iris, Wine, Breast Cancer, and uneven synthetic benchmarks (e.g., Iris: 0.914±0.012 vs 0.896±0.035) (Yung et al., 2023).
Hardware Demonstrations: For 5–6 point clusters on IBM Q hardware, VQE approaches optimal solutions but occasionally fails to find the ground state, while warm-start QAOA produces exact results in shorter wall-clock time (Beaulieu et al., 2021).

Quantum Chemistry

Ground-State Energies: LiH and N₂ simulations indicate that ClusterVQE matches the convergence of qubit-ADAPT-VQE but with fewer qubits, substantially reduced depth, and slower growth in the number of Pauli terms than iQCC (Zhang et al., 2021).
Efficient Modular Circuits: The modular 2-qubit cluster circuitry constructs $n$-qubit ansätze with depth $O(n)$ (e.g., depth 13–17 for 6 qubits vs. $\sim 100$ for ADAPT-VQE/UCCSD), attaining chemical accuracy ($\Delta E_c < 10^{-3}$) on small molecules and Ising models (Ghasempouri et al., 2023).
Resource Estimates: Use of MP2 screening and active space truncation cuts parameter and measurement budgets by over an order of magnitude with no significant loss in target accuracy (Romero et al., 2017).

6. Extensions, Limitations, and Future Directions

ClusterVQE techniques are extensible across both quantum unsupervised learning and correlated-electron physics:

Scalability: Scaling beyond $k=2$ clusters for data is possible using general position label states or multi-qubit encodings, with qubit number scaling as $k$ or $k \log N$, and coresets providing a tractable route for large $N$ (Yogendran et al., 2024).
Measurement Overhead: For coreset-based clustering, Hamiltonian term count is $O(m^2)$; mitigation via Pauli grouping and shadow tomography is suggested for near-term devices.
Unsolved Theoretical Problems: For some quantum-tailored coresets, worst-case approximation guarantees are not yet established (Yung et al., 2023). Clustering for $k > 2$ with qutrits or higher-dimensional encodings is open.
Noise and Error Mitigation: Shallow, parallelizable circuits and error mitigation protocols are critical for NISQ device feasibility (Bermejo et al., 2022, Zhang et al., 2021, Ghasempouri et al., 2023).
Hybrid Adaptive Methods: Integration of adaptive ansatz growth (e.g., ADAPT-VQE) and natural-gradient preconditioning may alleviate barren plateaus and accelerate convergence in high-dimensional parameter spaces (Yogendran et al., 2024).
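As a rough illustration of the zero-noise extrapolation mentioned among the error-mitigation options, the sketch below fits a straight line to energies obtained at several noise scale factors and reads off the intercept at zero noise. The energies are synthetic and follow an assumed exactly linear noise model; real devices generally deviate from this, and the function name is hypothetical.

```python
def zero_noise_extrapolate(scales, values):
    """Least-squares line fit of values against noise scale; returns the value at scale 0."""
    n = len(scales)
    mean_s = sum(scales) / n
    mean_v = sum(values) / n
    slope = (sum((s - mean_s) * (v - mean_v) for s, v in zip(scales, values))
             / sum((s - mean_s) ** 2 for s in scales))
    return mean_v - slope * mean_s

# Synthetic example (assumed): the noiseless energy is -40 and noise adds 1.5 per scale unit.
true_energy = -40.0
scales = [1.0, 2.0, 3.0]
noisy = [true_energy + 1.5 * s for s in scales]

mitigated = zero_noise_extrapolate(scales, noisy)
print(noisy[0], mitigated)
```

With perfectly linear synthetic noise the extrapolation recovers the noiseless value. With sampled energies the fit would also amplify shot noise, which is one reason the number of scale factors is usually kept small.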

7. Comparison of ClusterVQE Implementations

Variant	Application Domain	Quantum Resource Scaling	Classical Preprocessing
Data-driven ClusterVQE (Bermejo et al., 2022)	Clustering (k ≥ 2)	$n = \log_2 k$ qubits	Precompute distances, label states
Coreset-VQE (Yung et al., 2023, Yogendran et al., 2024)	Clustering, Big Data	$m$ qubits (coreset)	Coreset construction, Hamiltonian assembly
Qubit-ClusterVQE (Zhang et al., 2021)	Quantum chemistry	$N/M$ qubits (per cluster)	Mutual information, graph partitioning, Hamiltonian dressing
Modular Cluster Circuits (Ghasempouri et al., 2023)	Quantum chemistry/TFIM	$n$ qubits, depth $O(n)$	Valence bond tiling, circuit templating

The ClusterVQE paradigm enables variational quantum and quantum-inspired algorithms for clustering and correlated systems to be applied efficiently on near-term hardware. This is achieved through hybridization of classical clustering (in data or qubit space), local circuit designs, and cost function strategies, as well as through data reduction (coresets) and tensor network simulation (Bermejo et al., 2022, Zhang et al., 2021, Ghasempouri et al., 2023, Yung et al., 2023, Yogendran et al., 2024, Beaulieu et al., 2021, Romero et al., 2017).

References (7)

Bermejo et al. (2022). Variational Quantum and Quantum-Inspired Clustering.
Zhang et al. (2021). Variational Quantum Eigensolver with Reduced Circuit Complexity.
Ghasempouri et al. (2023). Modular Cluster Circuits for the Variational Quantum Eigensolver.
Yung et al. (2023). Clustering by Contour coreset and variational quantum eigensolver.
Yogendran et al. (2024). Big data applications on small quantum computers.
Beaulieu et al. (2021). Evaluating performance of hybrid quantum optimization algorithms for MAXCUT Clustering using IBM runtime environment.
Romero et al. (2017). Strategies for quantum computing molecular energies using the unitary coupled cluster ansatz.
