BSE49 Dataset: Quantum Bond Benchmark

Updated 14 July 2025
  • The BSE49 dataset is a curated benchmark offering high-accuracy bond separation energies for 49 distinct single bonds computed with the (RO)CBS–QB3 method.
  • The associated study employs classical descriptors and dimensionality reduction techniques such as PCA and SHAP to map molecular information onto quantum circuits.
  • The dataset facilitates rigorous comparisons between parametrized quantum circuits and classical models, highlighting performance trade-offs and noise challenges.

The BSE49 dataset is a curated benchmark for quantum chemical machine learning, comprising reference bond separation energies for 49 distinct single bonds, each computed at the high-accuracy (RO)CBS–QB3 level of theory. BSE49 spans bonds between a broad set of elements (H, B, C, N, O, F, Si, P, S, and Cl), deliberately excluding cases such as H–H, H–F, and H–Cl. The rigor and diversity of BSE49 make it suitable for the assessment and benchmarking of both classical and quantum chemical methodologies, especially in predicting bond separation energies.

1. Composition and Theoretical Foundation

BSE49’s underlying data are reference bond separation energies for 49 distinct classes of chemical bonds (X–Y), with X and Y drawn from a set of key main-group elements. Each entry was calculated using the (RO)CBS–QB3 protocol, a composite quantum chemistry method recognized for yielding highly accurate thermochemical data. This systematic selection ensures the inclusion of a wide spectrum of chemically and industrially relevant bond types, making BSE49 representative of common and challenging quantum chemistry scenarios.

The exclusion of trivial or computationally difficult bonds (e.g., H–H, H–F, H–Cl) focuses the benchmark on chemically meaningful and methodologically tractable systems.

2. Data Representation and Preprocessing

For machine learning purposes, molecules in BSE49 are first described by classical descriptors, specifically Morgan fingerprints. These descriptors, whose dimensionality typically runs into the hundreds or thousands, provide a robust encoding of molecular structure. To fit the hardware constraints of the quantum circuits considered (5 or 16 qubits), dimensionality reduction is required. Two principal feature reduction strategies are used:

  • Principal Component Analysis (PCA): Yields a linear reduction of features, preserving maximal variance in the molecular data.
  • SHAP (SHapley Additive exPlanations) Analysis: Identifies and retains the most chemically informative features with respect to the regression target.

Post-reduction, the descriptors are commonly encoded into quantum circuits via angle encoding, directly mapping the reduced classical information onto quantum states.
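
As a concrete illustration of this pipeline, the following sketch generates Morgan fingerprints, reduces them with PCA, and angle-encodes the result onto a 5-qubit register. The tooling (RDKit, scikit-learn, PennyLane), fingerprint parameters, and scaling choices are assumptions made for demonstration and are not necessarily those used in the original study.

```python
# Illustrative preprocessing pipeline: Morgan fingerprints -> PCA -> angle encoding.
# Tooling and parameters are assumptions for demonstration only.
import numpy as np
import pennylane as qml
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler

N_QUBITS = 5  # matches the 5-qubit setting mentioned above

def morgan_fingerprints(smiles_list, n_bits=2048, radius=2):
    """Encode molecules as fixed-length Morgan fingerprint bit vectors."""
    fps = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
        fps.append(np.array(list(fp), dtype=float))
    return np.stack(fps)

def reduce_and_scale(X, n_components=N_QUBITS):
    """Project fingerprints onto a few principal components and rescale to
    [0, pi] so each component can serve as a rotation angle on one qubit."""
    X_red = PCA(n_components=n_components).fit_transform(X)
    return MinMaxScaler(feature_range=(0, np.pi)).fit_transform(X_red)

dev = qml.device("default.qubit", wires=N_QUBITS)

@qml.qnode(dev)
def encoded_state(features):
    """Angle-encode one reduced descriptor vector: one RY rotation per qubit."""
    qml.AngleEmbedding(features, wires=range(N_QUBITS), rotation="Y")
    return qml.state()

# Example usage on a handful of placeholder molecules (SMILES strings):
X = morgan_fingerprints(["CC", "CO", "CN", "CCO", "CCC", "CCN"])
X_scaled = reduce_and_scale(X)       # shape: (6, N_QUBITS), values in [0, pi]
state = encoded_state(X_scaled[0])   # 2**N_QUBITS statevector amplitudes
```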

3. Application and Assessment of Parametrized Quantum Circuits

A central focus is the deployment and evaluation of 168 distinct parametrized quantum circuits (PQCs) on BSE49. The circuits are constructed by systematically varying:

  • Data encoding strategies: Fourteen approaches are evaluated, affecting how classical features are transformed into quantum states.
  • Variational ansätze: Twelve architectural templates, differing in the choice of entangling gates, qubit connectivity, and layering, define the trainable portions of the quantum circuits.
  • Re-uploading depth: Some designs re-inject data after the initial encoding to enhance the circuit’s expressive capacity.

A generic circuit is denoted as

|\psi(\theta)\rangle = U(\theta)\,U_{\mathrm{enc}}(x)\,|0\rangle^{\otimes n}

where $U_{\mathrm{enc}}(x)$ encodes the (reduced) molecular descriptors $x$ into the quantum state and $U(\theta)$ is the variational (trainable) unitary.
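
The sketch below instantiates one circuit of this form in PennyLane, with angle encoding playing the role of $U_{\mathrm{enc}}(x)$, a strongly entangling layer template as $U(\theta)$, and optional data re-uploading. The specific templates, depths, and readout are illustrative assumptions rather than any of the 168 circuits actually benchmarked.

```python
# Illustrative PQC of the form |psi(theta)> = U(theta) U_enc(x) |0>^n,
# with optional data re-uploading. Template and depth choices are assumptions.
import pennylane as qml
from pennylane import numpy as np

N_QUBITS = 5
N_LAYERS = 2      # depth of each variational block
N_REUPLOADS = 2   # how many times the data is (re-)encoded

dev = qml.device("default.qubit", wires=N_QUBITS)

@qml.qnode(dev)
def pqc(x, theta):
    """theta has shape (N_REUPLOADS, N_LAYERS, N_QUBITS, 3)."""
    for block in range(N_REUPLOADS):
        # U_enc(x): angle-encode the reduced descriptor vector
        qml.AngleEmbedding(x, wires=range(N_QUBITS), rotation="Y")
        # U(theta): trainable entangling block
        qml.StronglyEntanglingLayers(theta[block], wires=range(N_QUBITS))
    # Read out a single expectation value as the (rescaled) energy prediction
    return qml.expval(qml.PauliZ(0))

# Random initial parameters with the shape expected by the ansatz
shape = (N_REUPLOADS,) + qml.StronglyEntanglingLayers.shape(
    n_layers=N_LAYERS, n_wires=N_QUBITS
)
theta0 = np.random.uniform(0, 2 * np.pi, size=shape, requires_grad=True)
x_example = np.random.uniform(0, np.pi, size=N_QUBITS, requires_grad=False)
print(pqc(x_example, theta0))
```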

For predictive modeling, the quantum circuits are trained to estimate bond separation energies using supervised regression. Performance is quantified using the mean absolute error (MAE)

\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i|

and the coefficient of determination (R²):

R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}

where $y_i$ are the reference bond energies and $\hat{y}_i$ are the PQC predictions.
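
A minimal sketch of how such a circuit could be trained by supervised regression and then scored with these two metrics, reusing the pqc() function from the previous snippet; the optimizer, loss, and hyperparameters are assumptions for demonstration only.

```python
# Illustrative training and evaluation loop for the PQC regressor sketched above;
# reuses pqc() from the previous snippet. Optimizer, loss, and hyperparameters
# are assumptions. Targets are assumed rescaled to [-1, 1] to match the range
# of the Pauli-Z expectation value.
import pennylane as qml
from pennylane import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

def mse_loss(theta, X, y):
    """Mean squared error between PQC outputs and (rescaled) reference energies."""
    loss = 0.0
    for xi, yi in zip(X, y):
        loss = loss + (pqc(xi, theta) - yi) ** 2
    return loss / len(X)

def train_pqc(X_train, y_train, theta0, steps=100, stepsize=0.1):
    """Plain gradient descent on the MSE loss."""
    opt = qml.GradientDescentOptimizer(stepsize=stepsize)
    theta = theta0
    for _ in range(steps):
        theta = opt.step(lambda t: mse_loss(t, X_train, y_train), theta)
    return theta

def evaluate(theta, X_test, y_test):
    """Report MAE and R^2 as defined above."""
    preds = np.array([float(pqc(x, theta)) for x in X_test])
    return mean_absolute_error(y_test, preds), r2_score(y_test, preds)
```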

4. Performance Analysis and Methodological Insights

The paper demonstrates that PQC performance depends acutely on circuit structure, data encoding, circuit depth, and training regimen. Key observations include:

  • Circuit expressivity: Deeper or more intricately entangled ansätze often yield increased capacity to fit complex patterns but pose greater optimization challenges.
  • Training set size: Learning curves illustrate that increased training data generally improves PQC regression accuracy. With fewer examples, some architectures are unable to generalize, revealing a dependence of predictive power on the interplay between model complexity and data availability.
  • Encoding impact: The ability of PQCs to exploit chemically relevant encodings (via PCA or SHAP) significantly affects outcomes, as different circuits adapt variably to the representation of input data.

Detailed tabular results enumerate MAE and R² for various circuit combinations, highlighting which architectures and reduction strategies are most effective in capturing the bond energy trends present in BSE49.

5. Comparison with Classical Machine Learning Benchmarks

Classical machine learning baselines, including ridge regression, lasso, and Gaussian process regression, provide a reference against which PQC performance is measured; a baseline sketch follows at the end of this section. When restricted to the same reduced feature sets as the PQCs:

  • Classical regressors generally exhibit high accuracy on BSE49, especially when using PCA-reduced fingerprints. These models, however, are sometimes prone to overfitting due to their flexibility and the relatively small size of the dataset.
  • PQC models exhibit a degree of regularization by design, stemming from architectural constraints and the finite number of variational parameters. In certain configurations, this results in less overfitting relative to analogous classical models for small datasets.

A unique limitation for PQCs arises from the need to project high-dimensional classical data into a small quantum feature space, which can restrict their capacity relative to classical models unconstrained by qubit count.
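
For reference, here is a minimal sketch of the classical baselines named above, fitted on the same PCA-reduced fingerprints the PQCs receive; the hyperparameters and kernel choices are illustrative assumptions.

```python
# Illustrative classical baselines on PCA-reduced fingerprints.
# Hyperparameters and kernel choices are assumptions for demonstration.
from sklearn.linear_model import Ridge, Lasso
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

def run_classical_baselines(X_reduced, y):
    """Fit ridge, lasso, and GP regression on the same reduced features as the PQCs."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_reduced, y, test_size=0.2, random_state=0
    )
    models = {
        "ridge": Ridge(alpha=1.0),
        "lasso": Lasso(alpha=0.01),
        "gpr": GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        preds = model.predict(X_te)
        results[name] = (mean_absolute_error(y_te, preds), r2_score(y_te, preds))
    return results
```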

6. Evaluation on Quantum Hardware and Physical Realism

The paper extends analysis from noiseless (state-vector) simulations to both noise-model-based simulations and real quantum hardware:

  • Noise impact: PQC models evaluated under realistic, hardware-inspired noise models show a significant reduction in predictive accuracy compared to noiseless simulations, reflecting the practical difficulties of near-term quantum devices (a toy noisy-simulation sketch follows this list).
  • Error mitigation: Application of error mitigation techniques, such as extrapolation and measurement-based correction, allows small PQCs to partially recover accuracy, achieving results comparable to their simulated counterparts for small qubit counts.
  • NISQ constraints: The findings indicate that while PQCs can, in principle, address problems as embodied in BSE49, their near-term utility depends critically on improving noise resilience and error correction strategies.
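
To make the noise study concrete, the sketch below evaluates an angle-encoded circuit on a density-matrix simulator with a simple depolarizing channel inserted after each entangling layer. This toy noise model is an assumption for illustration and is not the hardware-calibrated model (or the error mitigation pipeline) used in the paper.

```python
# Illustrative noisy evaluation: the same PQC structure on a density-matrix
# simulator with a toy depolarizing channel (not a hardware-calibrated model).
import pennylane as qml
from pennylane import numpy as np

N_QUBITS = 5
N_LAYERS = 2
P_DEPOL = 0.01  # assumed depolarizing probability per qubit per layer

noisy_dev = qml.device("default.mixed", wires=N_QUBITS)

@qml.qnode(noisy_dev)
def noisy_pqc(x, theta):
    qml.AngleEmbedding(x, wires=range(N_QUBITS), rotation="Y")
    for layer in range(N_LAYERS):
        qml.StronglyEntanglingLayers(theta[layer:layer + 1], wires=range(N_QUBITS))
        # Insert a depolarizing channel on every qubit after each entangling layer
        for w in range(N_QUBITS):
            qml.DepolarizingChannel(P_DEPOL, wires=w)
    return qml.expval(qml.PauliZ(0))

theta = np.random.uniform(
    0, 2 * np.pi, size=qml.StronglyEntanglingLayers.shape(N_LAYERS, N_QUBITS)
)
x = np.random.uniform(0, np.pi, size=N_QUBITS)
print(noisy_pqc(x, theta))  # compare against the noiseless expectation value
```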

7. Significance and Prospects

BSE49 serves as a stringent benchmark for quantum machine learning approaches in quantum chemistry, enabling the systematic assessment of quantum models’ ability to learn and predict chemically meaningful targets. The results underscore both the promise and the current limitations of PQCs:

  • Promise: Quantum circuits, when paired with effective encoding and variational strategies, can yield competitive regression performance for chemical problems, sometimes exhibiting beneficial regularization.
  • Limitations: The primary constraints are the sensitivity to noise, the burden of mapping high-dimensional chemistry data onto modest hardware, and the classical optimization difficulties typical of PQC training (e.g., barren plateaus).

A plausible implication is that continued advances in encoding schemes, circuit design, and quantum hardware are necessary for PQCs to realize their full potential in quantum chemistry and related molecular property prediction tasks, with BSE49 providing a robust platform for ongoing benchmarking and evaluation.