ClusterVQE: Hybrid Quantum Clustering
- ClusterVQE is a hybrid quantum algorithm that integrates variational quantum eigensolvers with clustering techniques to optimize unsupervised learning and molecular simulations.
- It employs both data-driven clustering methods and qubit-partitioned circuits to reduce quantum resource requirements while maintaining high accuracy.
- Numerical benchmarks on datasets like Iris and molecular simulations of LiH demonstrate ClusterVQE’s efficiency, scalability, and precision on resource-limited devices.
ClusterVQE refers to a family of quantum algorithms and quantum-inspired hybrid schemes leveraging the variational quantum eigensolver (VQE) in combination with data or qubit clustering to efficiently solve unsupervised learning and quantum chemistry problems on resource-limited hardware. These approaches employ iterative minimization of a cost Hamiltonian reflecting the target partitioning or molecular ground state, while exploiting cluster-based decompositions—of datapoints, qubits, or operator structure—to reduce circuit or device requirements. Supported by a range of implementations, ClusterVQE encompasses both data-driven clustering (e.g., k-means, MaxCut) and physics-motivated qubit-clustering for electronic structure (Bermejo et al., 2022, Zhang et al., 2021, Ghasempouri et al., 2023, Yung et al., 2023, Yogendran et al., 2024, Beaulieu et al., 2021, Romero et al., 2017).
1. Mathematical Formalism and Problem Mappings
ClusterVQE aims to encode a global optimization problem (such as clustering datapoints into classes, or finding the ground state of an $N$-qubit molecular Hamiltonian) into a quantum or quantum-inspired cost function amenable to variational simulation.
Data Clustering via Hamiltonians
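Given feature vectors $\vec{x}_i$ ($i = 1, \dots, N$) and $k$ clusters, the classical assignment cost is typically expressed as
$$H = \frac{1}{2} \sum_{i,j=1}^{N} d(\vec{x}_i, \vec{x}_j) \sum_{a=1}^{k} q_i^a q_j^a,$$
with $q_i^a \in \{0, 1\}$ and $\sum_{a=1}^{k} q_i^a = 1$ enforcing the one-hot constraint. Mapped to a quantum setting, this becomes
$$H = \frac{1}{2} \sum_{i,j=1}^{N} d(\vec{x}_i, \vec{x}_j) \sum_{a=1}^{k} f_i^a f_j^a, \qquad f_i^a = |\langle \psi^a | \psi_i \rangle|^2,$$
where $|\psi_i\rangle$ is the variational state of datapoint $i$ and $|\psi^a\rangle$ are maximally-orthogonal label reference states in the $2^n$-dimensional Hilbert space (Bermejo et al., 2022).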
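To make the one-hot assignment cost concrete, the following minimal sketch evaluates a pairwise-distance cost of the form $\frac{1}{2}\sum_{i,j} d(\vec{x}_i, \vec{x}_j) \sum_a q_i^a q_j^a$ for two candidate labelings. The squared Euclidean distance, the toy points, and the function names are assumptions made for illustration, not taken from the cited work.

```python
def sq_dist(x, y):
    """Squared Euclidean distance (an assumed choice for d)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def assignment_cost(points, labels, k):
    """0.5 * sum_{i,j} d(x_i, x_j) * sum_a q_i^a q_j^a with one-hot q."""
    n = len(points)
    # q[i][a] = 1 if point i is assigned to cluster a, else 0 (one-hot rows).
    q = [[1 if labels[i] == a else 0 for a in range(k)] for i in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            same_cluster = sum(q[i][a] * q[j][a] for a in range(k))
            total += 0.5 * sq_dist(points[i], points[j]) * same_cluster
    return total

# Two tight pairs of points far apart from each other (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
good_cost = assignment_cost(points, [0, 0, 1, 1], k=2)  # pairs kept together
bad_cost = assignment_cost(points, [0, 1, 0, 1], k=2)   # pairs split apart
```

Only pairs sharing a cluster contribute, so the labeling that keeps nearby points together yields the lower cost.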
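For binary clustering (MAXCUT), the assignment reduces to an Ising Hamiltonian, which up to an additive constant and overall normalization reads
$$H = \sum_{i<j} w_{ij} Z_i Z_j,$$
with $w_{ij}$ a weighted adjacency or distance matrix and $Z_i$ the Pauli-$Z$ operator acting on qubit $i$ (Beaulieu et al., 2021, Yung et al., 2023, Yogendran et al., 2024).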
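As an illustration of the binary case, the sketch below brute-forces the ground state of an Ising cost $\sum_{i<j} w_{ij} s_i s_j$ with classical spins $s_i = \pm 1$ standing in for the $Z_i$ eigenvalues. The sign convention, the absolute-difference weights, and the one-dimensional toy data are assumptions; exhaustive search is only feasible for a handful of points and stands in for the variational optimization.

```python
from itertools import product

def ising_energy(w, spins):
    """sum_{i<j} w[i][j] * s_i * s_j for spins s_i in {+1, -1}."""
    n = len(spins)
    return sum(w[i][j] * spins[i] * spins[j]
               for i in range(n) for j in range(i + 1, n))

def brute_force_ground_state(w):
    """Enumerate all 2^n spin configurations and return the lowest-energy one."""
    n = len(w)
    best = min(product((1, -1), repeat=n), key=lambda s: ising_energy(w, s))
    return best, ising_energy(w, best)

# Hypothetical 1-D data: two points near 0 and two points near 10.
xs = [0, 1, 10, 11]
w = [[abs(a - b) for b in xs] for a in xs]  # distance matrix as edge weights

spins, energy = brute_force_ground_state(w)
```

Minimizing this energy pushes distant points (large $w_{ij}$) onto opposite spins, so the sign of each spin can be read off as a cluster label.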
Quantum Chemistry with Clustered Qubits
In molecular VQE, the unitary coupled cluster (UCC) ansatz or one of its variants is employed. ClusterVQE partitions the $N$-qubit system into clusters $\{C_a\}$ based on quantum mutual information, so as to maximize intra-cluster entanglement and minimize inter-cluster coupling (Zhang et al., 2021). The wavefunction is represented as a tensor product of cluster states, and the full Hamiltonian is "dressed" to capture inter-cluster correlations: $H^{d} = U_{ab}^{\dagger} H U_{ab}$, with unitaries $U_{ab}$ connecting clusters $C_a$ and $C_b$.
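The partitioning step can be pictured with a small sketch. It assumes a precomputed symmetric mutual-information matrix and groups qubits by thresholding it and taking connected components (union-find). This thresholding rule, the function name `cluster_qubits`, and the numbers in `mi` are hypothetical stand-ins; the cited work may use a different graph-partitioning procedure.

```python
def cluster_qubits(mi, threshold):
    """Group qubits whose pairwise mutual information is >= threshold.

    `mi` is a symmetric n x n matrix; clusters are the connected components
    of the graph that keeps only the strongly correlated pairs.
    """
    n = len(mi)
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the component root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if mi[i][j] >= threshold:
                parent[find(j)] = find(i)  # merge the two components

    groups = {}
    for qubit in range(n):
        groups.setdefault(find(qubit), []).append(qubit)
    return sorted(groups.values())

# Hypothetical mutual-information values for a 4-qubit system.
mi = [
    [0.00, 0.90, 0.05, 0.01],
    [0.90, 0.00, 0.02, 0.03],
    [0.05, 0.02, 0.00, 0.80],
    [0.01, 0.03, 0.80, 0.00],
]
clusters = cluster_qubits(mi, threshold=0.5)
```

Here qubits 0 and 1 end up in one cluster and qubits 2 and 3 in another, leaving only weak correlations across the cut.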
2. Quantum Circuit Architectures and Computational Strategies
Data Mapping and Label-State Encoding
Data vectors are mapped to quantum states via feature-map circuits, such as sequences of single-qubit rotations (e.g., $R_y$ and $R_z$), distributing features over $2n$ rotation angles for $n$ qubits. Labels are not restricted to computational basis vectors; rather, one uses maximally mutually orthogonal states, e.g., the four Bloch sphere tetrahedron vertices for up to $k = 4$ clusters with $n = 1$ qubit (Bermejo et al., 2022).
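The tetrahedral label states can be checked numerically. The sketch below builds four single-qubit states at the vertices of a regular tetrahedron inscribed in the Bloch sphere, confirms that every pair has the same fidelity of 1/3, and assigns a state to the label of maximal fidelity. The particular orientation of the tetrahedron and the helper names are assumptions chosen for illustration.

```python
import cmath
import math

def bloch_state(theta, phi):
    """Single-qubit state cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    return (complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2))

def fidelity(a, b):
    """|<a|b>|^2 for two single-qubit states given as amplitude pairs."""
    overlap = a[0].conjugate() * b[0] + a[1].conjugate() * b[1]
    return abs(overlap) ** 2

# Tetrahedron vertices: the north pole plus three points at polar angle
# arccos(-1/3), separated by 120 degrees in azimuth.
THETA_T = math.acos(-1.0 / 3.0)
LABEL_STATES = [bloch_state(0.0, 0.0)] + [
    bloch_state(THETA_T, 2 * math.pi * m / 3) for m in range(3)
]

def assign_label(state):
    """Index of the label state with maximal fidelity to `state`."""
    return max(range(len(LABEL_STATES)),
               key=lambda a: fidelity(LABEL_STATES[a], state))

pairwise = [fidelity(LABEL_STATES[a], LABEL_STATES[b])
            for a in range(4) for b in range(a + 1, 4)]
```

Four mutually orthogonal states cannot exist in a two-dimensional Hilbert space, so the equal pairwise fidelity of 1/3 is the most distinguishable arrangement one qubit can offer for four labels.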
Clustered Circuit Schemes
ClusterVQE reduces circuit depth and device width by distributing the problem over smaller subcircuits:
- In quantum chemistry, each cluster simulation employs a shallower ansatz (e.g., hardware-efficient, UCC, or modular cluster circuits), avoiding large-width entanglers and limiting inter-cluster unitaries to classical pre/post-processing (Zhang et al., 2021, Ghasempouri et al., 2023).
- In data clustering, circuits prepare the parameterized trial state, with data-mapped rotations followed by a layer of CNOTs for entanglement. For coreset-based approaches, the number of qubits matches the coreset size (Yogendran et al., 2024, Yung et al., 2023).
3. Workflow: Optimization and Measurement
The variational workflow common to all implementations proceeds as:
- Hamiltonian Construction: Encode cost function as a qubit Hamiltonian (clustering cost, MAXCUT, UCC or dressed Hamiltonian).
- Ansatz Initialization: Construct parameterized trial circuits (e.g., hardware-efficient, modular cluster, UCCSD).
- Classical Optimization: Apply optimizers such as Adam, SPSA, COBYLA, or SLSQP to minimize energy expectation with respect to the variational parameters.
- Energy and Gradient Measurement: Measure and average Pauli-term expectations via shot-based sampling; analytic gradient computation is enabled through dressed operators or Hadamard-test–like subcircuits (Romero et al., 2017, Zhang et al., 2021).
- Cluster Assignment or State Reconstruction: For clustering, assign classes via maximal label-state fidelity or output bitstring; for chemistry, extract the optimized wavefunction and ground-state energy.
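The steps above can be sketched end to end on the smallest possible instance: two datapoints, one qubit each, and the cost Hamiltonian $Z_0 Z_1$. The example is a noiseless statevector simulation written with the standard library only. It assumes an ansatz of two $R_y$ rotations followed by one CNOT, uses plain gradient descent with the parameter-shift rule in place of Adam, SPSA, COBYLA, or SLSQP, and computes exact expectation values rather than shot-based estimates.

```python
import math

def apply_ry(state, theta, qubit):
    """Apply Ry(theta) to one qubit of a real 2-qubit state (index = 2*b0 + b1)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    stride = 2 if qubit == 0 else 1
    new = list(state)
    for base in range(4):
        if (base // stride) % 2 == 0:  # basis states where this qubit is 0
            a0, a1 = state[base], state[base + stride]
            new[base] = c * a0 - s * a1
            new[base + stride] = s * a0 + c * a1
    return new

def ansatz_state(params):
    """Ry on each qubit of |00>, then CNOT with control 0 and target 1."""
    state = [1.0, 0.0, 0.0, 0.0]
    state = apply_ry(state, params[0], 0)
    state = apply_ry(state, params[1], 1)
    state[2], state[3] = state[3], state[2]  # CNOT swaps |10> and |11>
    return state

def energy(params):
    """<Z0 Z1>: +1 on bitstrings 00 and 11, -1 on 01 and 10."""
    probs = [a * a for a in ansatz_state(params)]
    return probs[0] + probs[3] - probs[1] - probs[2]

def parameter_shift_gradient(params):
    """Exact gradient for rotation gates: (E(t + pi/2) - E(t - pi/2)) / 2."""
    grad = []
    for k in range(len(params)):
        plus, minus = list(params), list(params)
        plus[k] += math.pi / 2
        minus[k] -= math.pi / 2
        grad.append(0.5 * (energy(plus) - energy(minus)))
    return grad

def run_vqe(params, learning_rate=0.2, steps=200):
    """Plain gradient descent on the energy (a stand-in optimizer)."""
    params = list(params)
    for _ in range(steps):
        grad = parameter_shift_gradient(params)
        params = [p - learning_rate * g for p, g in zip(params, grad)]
    return params

opt_params = run_vqe([0.4, 0.8])
final_energy = energy(opt_params)
probs = [a * a for a in ansatz_state(opt_params)]
bitstring = format(max(range(4), key=probs.__getitem__), "02b")
```

The most probable bitstring places the two points in different clusters, which is the minimum-energy cut for a single positive-weight edge.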
Optionally, error mitigation is applied via zero-noise extrapolation or Clifford-data regression (Bermejo et al., 2022, Zhang et al., 2021).
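Zero-noise extrapolation can be illustrated with a short sketch: the same expectation value is measured at several artificially amplified noise levels and a fit is extrapolated back to zero noise. The linear least-squares model, the scale factors, and the synthetic energies below are assumptions; practical implementations often use Richardson or exponential fits instead.

```python
def zero_noise_extrapolate(scale_factors, values):
    """Fit values = intercept + slope * scale and return the intercept."""
    n = len(scale_factors)
    mean_x = sum(scale_factors) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in scale_factors)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(scale_factors, values))
    slope = sxy / sxx
    return mean_y - slope * mean_x  # value of the fit at scale = 0

# Synthetic energies at noise scale factors 1, 2, 3 (hypothetical numbers
# generated from E(scale) = -1.0 + 0.1 * scale).
noisy_energies = [-0.9, -0.8, -0.7]
estimate = zero_noise_extrapolate([1.0, 2.0, 3.0], noisy_energies)
```

Because the synthetic data are exactly linear, the extrapolated estimate recovers the noiseless value of -1.0 up to floating-point rounding.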
4. Quantum-Inspired and Coreset Techniques
Classical simulability and data size reduction are key to making ClusterVQE practical on current devices:
- Tensor Network Simulation: In Bermejo et al. (2022), the variational circuit is chosen so that the state of each data point is a Matrix Product State (MPS) of bond dimension $\chi$, enabling efficient gradient-based or imaginary-time optimization entirely classically ("quantum-inspired clustering").
- Coreset Reduction: To handle large datasets, a classical coreset (a weighted subset of $m$ data points) is constructed, and only this $m$-point coreset is mapped to a quantum Hamiltonian (Yogendran et al., 2024, Yung et al., 2023). Quantum-tailored coresets, such as the deterministic "Contour coreset," are specifically designed to avoid missing minority clusters when $m$ is small (Yung et al., 2023).
- Resource Scaling: With clustering, the quantum resource load (qubits, circuit depth) scales with coreset cardinality, permitting larger classical datasets to be processed by small quantum devices.
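The resource argument can be made concrete with a sketch that reduces a dataset to a weighted subset and assembles pairwise Ising couplings on that subset alone. Uniform random sampling is used purely as a stand-in (it is not the Contour coreset), and the coupling rule $w_i w_j \, d(\vec{x}_i, \vec{x}_j)$ as well as the function names are assumptions made for illustration.

```python
import math
import random

def uniform_coreset(points, m, seed=0):
    """Pick m points uniformly at random; each stands for len(points)/m originals."""
    rng = random.Random(seed)
    chosen = rng.sample(points, m)
    weight = len(points) / m
    return [(p, weight) for p in chosen]

def maxcut_couplings(coreset):
    """Pairwise couplings w_i * w_j * d(x_i, x_j) for each pair i < j."""
    couplings = {}
    for i in range(len(coreset)):
        for j in range(i + 1, len(coreset)):
            (xi, wi), (xj, wj) = coreset[i], coreset[j]
            couplings[(i, j)] = wi * wj * math.dist(xi, xj)
    return couplings

# 100 hypothetical 2-D points on a grid, reduced to a 5-point coreset.
data = [(float(i % 10), float(i // 10)) for i in range(100)]
coreset = uniform_coreset(data, m=5)
couplings = maxcut_couplings(coreset)

num_qubits = len(coreset)   # one qubit per coreset point
num_terms = len(couplings)  # m * (m - 1) / 2 two-qubit terms
```

The qubit count equals the coreset size $m$ and the number of two-qubit terms is $m(m-1)/2$, independent of how many points the original dataset contains.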
5. Numerical Results and Empirical Performance
Benchmarks in Data Clustering
- Clustering Accuracy: On the Iris dataset (n=1 qubit, k=3 clusters), ClusterVQE achieves 96% accuracy in 15–20 epochs using the Adam optimizer (Bermejo et al., 2022).
- Coreset-Based VQE: The VQE+Contour coreset approach shows superior average accuracy and lower standard deviation compared to QAOA+generic coreset for Iris, Wine, Breast Cancer, and uneven synthetic benchmarks (e.g., Iris: 0.914±0.012 vs 0.896±0.035) (Yung et al., 2023).
- Hardware Demonstrations: For 5–6 point clusters on IBM Q hardware, VQE approaches optimal solutions but occasionally fails to find the ground state, while warm-start QAOA produces exact results in shorter wall-clock time (Beaulieu et al., 2021).
Quantum Chemistry
- Ground-State Energies: LiH and N₂ simulations indicate that ClusterVQE matches the convergence of qubit-ADAPT-VQE but with fewer qubits and substantially reduced depth, and that its number of Pauli terms grows more slowly than in iQCC (Zhang et al., 2021).
- Efficient Modular Circuits: Modular 2-qubit cluster circuits build $N$-qubit ansätze of low depth (e.g., depth 13–17 for 6 qubits vs. 100 for ADAPT-VQE/UCCSD), attaining chemical accuracy (about 1.6 mHa, or 1 kcal/mol) on small molecules and Ising models (Ghasempouri et al., 2023).
- Resource Estimates: Use of MP2 screening and active space truncation cuts parameter and measurement budgets by over an order of magnitude with no significant loss in target accuracy (Romero et al., 2017).
6. Extensions, Limitations, and Future Directions
ClusterVQE techniques are extensible across both quantum unsupervised learning and correlated-electron physics:
- Scalability: Scaling beyond $k = 4$ clusters for data is possible using general-position label states or multi-qubit encodings, with the qubit count growing with $k$, and coresets providing a tractable route for large datasets (Yogendran et al., 2024).
- Measurement Overhead: For coreset-based clustering on an $m$-point coreset, the Hamiltonian term count is $O(m^2)$; mitigation via Pauli grouping and shadow tomography is suggested for near-term devices.
- Unsolved Theoretical Problems: For some quantum-tailored coresets, worst-case approximation guarantees are not yet established (Yung et al., 2023). Clustering for $k > 2$ with qutrits or higher-dimensional encodings is open.
- Noise and Error Mitigation: Shallow, parallelizable circuits and error mitigation protocols are critical for NISQ device feasibility (Bermejo et al., 2022, Zhang et al., 2021, Ghasempouri et al., 2023).
- Hybrid Adaptive Methods: Integration of adaptive ansatz growth (e.g., ADAPT-VQE) and natural-gradient preconditioning may alleviate barren plateaus and accelerate convergence in high-dimensional parameter spaces (Yogendran et al., 2024).
7. Comparison of ClusterVQE Implementations
| Variant | Application Domain | Quantum Resource Scaling | Classical Preprocessing |
|---|---|---|---|
| Data-driven ClusterVQE (Bermejo et al., 2022) | Clustering (k ≥ 2) | $n$ qubits per datapoint | Precompute distances, label states |
| Coreset-VQE (Yung et al., 2023, Yogendran et al., 2024) | Clustering, Big Data | $m$ qubits (coreset size) | Coreset construction, Hamiltonian assembly |
| Qubit-ClusterVQE (Zhang et al., 2021) | Quantum chemistry | Fewer than $N$ qubits (per cluster) | Mutual information, graph partitioning, Hamiltonian dressing |
| Modular Cluster Circuits (Ghasempouri et al., 2023) | Quantum chemistry/TFIM | $N$ qubits, low depth | Valence bond tiling, circuit templating |
The ClusterVQE paradigm enables variational quantum and quantum-inspired algorithms for clustering and correlated systems to be applied efficiently on near-term hardware. This is achieved through hybridization of classical clustering (in data or qubit space), local circuit designs, and cost function strategies, as well as through data reduction (coresets) and tensor network simulation (Bermejo et al., 2022, Zhang et al., 2021, Ghasempouri et al., 2023, Yung et al., 2023, Yogendran et al., 2024, Beaulieu et al., 2021, Romero et al., 2017).