DualBind Model: 3D Binding Affinity Predictor

Updated 16 July 2025

DualBind Model is a 3D deep learning framework that integrates supervised MSE and unsupervised DSM losses to accurately predict protein–ligand binding affinities.
Its architecture leverages full atomic features and frame averaging to achieve rotational and translational invariance in modeling protein–ligand interactions.
The dual-loss paradigm enables scalable virtual screening and efficient drug discovery by reducing reliance on extensive labeled data.

The DualBind model is a three-dimensional, structure-based deep learning framework designed for accurate prediction of protein–ligand binding affinities. By combining a supervised mean squared error (MSE) loss with an unsupervised denoising score matching (DSM) loss within an SE(3)-invariant neural network, DualBind addresses both the scarcity of reliable labeled data and the limitations of physical simulation-based affinity prediction. The model is validated on large-scale datasets such as ToxBench, and is shown to achieve high accuracy and generalizability. Its architecture and training approach allow it to leverage both labeled and unlabeled data, offering a scalable solution for virtual screening and drug discovery by approximating computationally intensive physical binding free energy calculations at a fraction of the cost and time.

1. Architectural Foundations

DualBind operates directly on the full three-dimensional (3D) representation of protein–ligand complexes. Each complex is denoted as $C = (A, X)$, where $A \in \mathbb{R}^{n \times d}$ encodes atomic features (for both protein and ligand atoms) and $X \in \mathbb{R}^{n \times 3}$ provides atom coordinates. The central model component is a learnable energy function $E_\theta(A, X)$ mapping the 3D structure to a scalar binding energy.

A key architectural element in DualBind is its SE(3)-invariant backbone, constructed with a frame averaging neural network. This guarantees that the model’s output is invariant to rotations and translations, a crucial property in molecular systems. The model processes atom-level features and spatial relationships by constructing interaction-aware representations through pairwise atomic interaction layers. Atom-wise attention mechanisms are integrated within the frame-averaging network, enabling DualBind to capture complex relational patterns and spatial context within the protein–ligand interface. Notably, the DualBind framework is model-agnostic regarding the precise parameterization of $E_\theta(A, X)$, permitting integration with alternative 3D interaction modeling architectures.
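The invariance property described above can be illustrated with a small frame-averaging sketch. It is a toy under stated assumptions, not DualBind's network: `toy_energy` is a deliberately non-invariant stand-in for the learned function, atomic features are reduced to one scalar per atom, frames are taken from a PCA of the centered coordinates, and the eigenvectors come from a hand-written Jacobi routine so that the example needs only the standard library. Averaging over all eight sign choices of the principal axes also gives invariance to reflections; SE(3)-only variants typically keep just the right-handed frames.

```python
import math
from itertools import product

def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def jacobi_eigvecs3(S, sweeps=30):
    """Eigenvectors (as columns) of a symmetric 3x3 matrix, sorted by
    decreasing eigenvalue, computed with cyclic Jacobi rotations."""
    S = [row[:] for row in S]
    V = [[float(i == j) for j in range(3)] for i in range(3)]
    for _ in range(sweeps):
        for p, q in ((0, 1), (0, 2), (1, 2)):
            if abs(S[p][q]) < 1e-15:
                continue
            diff = S[q][q] - S[p][p]
            # Rotation angle (|theta| <= pi/4) that zeroes the (p, q) entry.
            theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2.0 * S[p][q] / diff)
            c, s = math.cos(theta), math.sin(theta)
            J = [[float(i == j) for j in range(3)] for i in range(3)]
            J[p][p], J[q][q], J[p][q], J[q][p] = c, c, s, -s
            S = matmul(transpose(J), matmul(S, J))
            V = matmul(V, J)
    order = sorted(range(3), key=lambda k: -S[k][k])
    return [[V[i][k] for k in order] for i in range(3)]

def frame_average(f, A, X):
    """Average f over the 8 PCA frames of X (all sign flips of the axes)."""
    n = len(X)
    centroid = [sum(row[k] for row in X) / n for k in range(3)]
    Xc = [[row[k] - centroid[k] for k in range(3)] for row in X]  # removes translation
    U = jacobi_eigvecs3(matmul(transpose(Xc), Xc))
    P = matmul(Xc, U)  # coordinates expressed in the principal-axis frame
    total = 0.0
    for signs in product((1.0, -1.0), repeat=3):
        total += f(A, [[signs[k] * row[k] for k in range(3)] for row in P])
    return total / 8.0

def toy_energy(A, X):
    """Hypothetical, deliberately NOT invariant stand-in for E_theta(A, X)."""
    return sum(a * (0.5 * x + y * y + math.exp(0.3 * z)) for a, (x, y, z) in zip(A, X))

def rigid_transform(X, alpha, beta, t):
    """Rotate about z by alpha, then about x by beta, then translate by t."""
    ca, sa, cb, sb = math.cos(alpha), math.sin(alpha), math.cos(beta), math.sin(beta)
    Rz = [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]
    Rx = [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]]
    R = matmul(Rx, Rz)
    return [[sum(R[k][j] * row[j] for j in range(3)) + t[k] for k in range(3)] for row in X]

# Made-up five-atom example.
A = [1.0, 0.5, 2.0, 1.5, 0.7]
X = [[0.0, 0.0, 0.0], [3.0, 0.5, 0.1], [-2.5, 1.5, -0.3], [1.0, -2.0, 0.8], [-1.5, 0.2, -1.0]]
X_moved = rigid_transform(X, 0.7, -1.1, [3.0, -2.0, 5.0])
```

Calling `frame_average(toy_energy, A, X)` and `frame_average(toy_energy, A, X_moved)` should agree up to floating-point error, whereas `toy_energy` alone gives different values for the two poses. The sketch assumes the three principal variances are distinct; degenerate geometries would need extra handling.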

2. Dual-Loss Learning Principle

DualBind’s training objective is the sum of two complementary terms:

Supervised Loss (MSE): For labeled complexes with ground-truth binding affinity $y$, the MSE loss:

$\mathcal{L}_{\mathrm{MSE}} = [E_\theta(A, X) - y]^2$

directly encourages the model to approximate the experimentally measured (or AB-FEP-calculated) binding free energy.

Unsupervised Loss (DSM): The denoising score matching loss shapes the learned energy landscape using perturbed ligand coordinates. For every ligand atom $x_i$, a noisy coordinate $\tilde{x}_i = x_i + \sigma \varepsilon_i$ (with $\varepsilon_i \sim \mathcal{N}(0, I_3)$) is generated. The DSM loss is then:

$\mathcal{L}_{\mathrm{DSM}} = \mathbb{E}_{q(\tilde{X}|X) p_{\mathrm{data}}(X)} \left[ \left\| \nabla_{\tilde{X}} E_\theta(A, \tilde{X}) + \frac{\tilde{X} - X}{\sigma^2} \right\|^2 \right]$

which encourages the model’s energy gradients to recover the true minima of the data distribution.

The total loss is:

$\mathcal{L}_{\mathrm{total}} = \mathcal{L}_{\mathrm{MSE}} + \lambda \mathcal{L}_{\mathrm{DSM}}$

where $\lambda$ is a hyperparameter balancing the two terms; it is set to 2 in the ToxBench experiments (Liu et al., 11 Jul 2025). This learning paradigm allows the model to benefit both from precise affinity measurements and from structural information available even in the absence of labels.
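As a rough illustration of the DSM term, the sketch below estimates it by Monte Carlo for a single complex. Several things here are assumptions rather than details from the source: `toy_energy` is a hypothetical stand-in for $E_\theta$, gradients are taken by central finite differences instead of automatic differentiation, only ligand coordinates are perturbed and differentiated, and the expectation is replaced by an average over a few noise draws. The sign of the target term follows the expression above as written; formulations that define the density as $\exp(-E)$ use the opposite sign, so it is worth checking against the reference implementation.

```python
import random

def numerical_grad(energy, A, X, h=1e-5):
    """Central-difference gradient of energy(A, X) for every coordinate."""
    X = [row[:] for row in X]  # work on a copy so the caller's data is untouched
    grad = [[0.0, 0.0, 0.0] for _ in X]
    for i in range(len(X)):
        for k in range(3):
            x0 = X[i][k]
            X[i][k] = x0 + h
            e_plus = energy(A, X)
            X[i][k] = x0 - h
            e_minus = energy(A, X)
            X[i][k] = x0
            grad[i][k] = (e_plus - e_minus) / (2.0 * h)
    return grad

def dsm_loss(energy, A, X, ligand_idx, sigma, n_samples=32, seed=0):
    """Monte Carlo estimate of || grad E(A, X_noisy) + (X_noisy - X) / sigma^2 ||^2,
    summed over ligand coordinates and averaged over noise draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        X_noisy = [row[:] for row in X]
        for i in ligand_idx:  # perturb ligand atoms only
            for k in range(3):
                X_noisy[i][k] += sigma * rng.gauss(0.0, 1.0)
        grad = numerical_grad(energy, A, X_noisy)
        for i in ligand_idx:
            for k in range(3):
                residual = grad[i][k] + (X_noisy[i][k] - X[i][k]) / sigma ** 2
                total += residual * residual
    return total / n_samples

def toy_energy(A, X):
    """Hypothetical smooth pairwise energy; A holds one scalar per atom."""
    e = 0.0
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            d2 = sum((X[i][k] - X[j][k]) ** 2 for k in range(3))
            e += A[i] * A[j] / (1.0 + d2)
    return e

# Made-up example: two "protein" atoms and one "ligand" atom (index 2).
A = [1.0, 1.0, -0.5]
X = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.5, 1.2, 0.3]]
loss = dsm_loss(toy_energy, A, X, ligand_idx=[2], sigma=0.5)
```

In a real training loop the gradient would come from the framework's autograd, and the loss would be averaged over a mini-batch of complexes.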

3. Training and Data Utilization

DualBind is trained on large, label-rich datasets like ToxBench, which contains 8,770 ERα–ligand complexes with binding free energies calculated using absolute binding free energy perturbation (AB-FEP). The dataset is split into training, validation, and testing sets along non-overlapping ligand partitions to enforce generalization and prevent ligand-identity shortcuts.
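A ligand-disjoint split of the kind described above can be sketched as follows. The record layout (`ligand_id`, `complex_id`), the 80/10/10 proportions, and the random seed are illustrative assumptions; the source does not state how ToxBench partitions its ligands beyond the non-overlap requirement.

```python
import random

def split_by_ligand(records, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign whole ligands to train/val/test so no ligand appears in two splits."""
    ligands = sorted({r["ligand_id"] for r in records})
    random.Random(seed).shuffle(ligands)
    n_train = int(fractions[0] * len(ligands))
    n_val = int(fractions[1] * len(ligands))
    assignment = {}
    for idx, lig in enumerate(ligands):
        if idx < n_train:
            assignment[lig] = "train"
        elif idx < n_train + n_val:
            assignment[lig] = "val"
        else:
            assignment[lig] = "test"
    splits = {"train": [], "val": [], "test": []}
    for r in records:
        splits[assignment[r["ligand_id"]]].append(r)
    return splits

# Made-up records: 40 complexes built from 10 distinct ligands.
records = [{"ligand_id": f"LIG{i % 10}", "complex_id": i} for i in range(40)]
splits = split_by_ligand(records)
```

Because the assignment is made per ligand rather than per complex, every pose or complex of a given ligand lands in the same partition.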

During training:

For labeled entries, both $\mathcal{L}_{\mathrm{MSE}}$ and $\mathcal{L}_{\mathrm{DSM}}$ are used.
For unlabeled entries (i.e., structures without experimentally measured or AB-FEP labels), only $\mathcal{L}_{\mathrm{DSM}}$ is applied.
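The two cases above amount to masking the MSE term whenever a label is missing. A minimal sketch, assuming each entry already carries a predicted energy and a precomputed DSM value, and assuming the batch loss is a plain mean over entries (the source does not specify the reduction):

```python
def complex_loss(pred_energy, dsm_term, label=None, lam=2.0):
    """MSE + lam * DSM for labeled entries; lam * DSM alone when label is None."""
    mse = (pred_energy - label) ** 2 if label is not None else 0.0
    return mse + lam * dsm_term

def batch_loss(entries, lam=2.0):
    """Mean loss over a batch of dicts with keys 'pred', 'dsm', optional 'label'."""
    losses = [complex_loss(e["pred"], e["dsm"], e.get("label"), lam) for e in entries]
    return sum(losses) / len(losses)

# Made-up batch: one labeled and one unlabeled complex.
batch = [
    {"pred": 1.0, "dsm": 0.5, "label": 3.0},  # uses both terms
    {"pred": 1.0, "dsm": 0.5},                # DSM term only
]
value = batch_loss(batch)
```

The default `lam=2.0` mirrors the value reported for the ToxBench experiments; a label of exactly zero is still treated as labeled because the check is against `None`.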

The practical data pipeline is implemented using PyTorch/PyTorch Lightning, with optimization performed by Adam at a learning rate of $5 \times 10^{-4}$. Batch size and hyperparameters such as the noise scale $\sigma$, dropout rate, and loss weights are tuned on the validation set. Training is distributed across 8 NVIDIA A100 GPUs and runs for 120 epochs (Liu et al., 11 Jul 2025).

4. Binding Energy Function and Physical Plausibility

DualBind’s energy function $E_\theta(A, X)$ learns to mirror the physically meaningful binding free energy landscape. Through the supervised MSE term, the model is explicitly anchored to high-fidelity affinity values (such as those from AB-FEP). The unsupervised DSM loss steers the model to place local minima of $E_\theta$ near ground-truth binding geometries, so that even slightly perturbed complexes are energetically penalized.

The DSM loss is minimized when the energy gradient satisfies:

$\nabla_{\tilde{X}} E_\theta(A, \tilde{X}) \approx -\frac{\tilde{X} - X}{\sigma^2}$

for each perturbed sample $(A, \tilde{X})$, which is the defining characteristic of denoising score matching for Gaussian noise. This strategy results in energy functions that are both predictive and physically grounded, facilitating not only affinity prediction but also continuous optimization in chemical space.

5. Benchmark Results and Comparative Assessment

Extensive evaluation is performed on the CASF-2016 benchmark and the ToxBench dataset (Liu et al., 11 Jul 2025, Liu et al., 11 Jun 2024). On ToxBench, DualBind achieves the following metrics:

Pearson correlation ($R_p$): 0.844
Coefficient of determination ($R^2$): 0.704
Spearman rank correlation ($\rho$): 0.786
Root mean square error (RMSE): 2.392 kcal/mol
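For reference, the four metrics listed above can be computed as in the sketch below. It assumes $R^2$ is the coefficient of determination $1 - SS_{\mathrm{res}}/SS_{\mathrm{tot}}$ (which need not equal the squared Pearson correlation) and that Spearman's $\rho$ is the Pearson correlation of average ranks; the numbers in the example are made up and unrelated to the reported results.

```python
import math

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))

def r_squared(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Made-up values in kcal/mol, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
report = {
    "pearson": pearson(y_true, y_pred),
    "r2": r_squared(y_true, y_pred),
    "spearman": spearman(y_true, y_pred),
    "rmse": rmse(y_true, y_pred),
}
```

Pearson and Spearman measure linear and rank agreement respectively, while $R^2$ and RMSE also penalize systematic offsets in the predicted energies.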

Comparison with baseline approaches demonstrates that DualBind surpasses both ligand-only and fixed-feature models, as well as existing supervised and DSM-only methods, particularly in ranking (virtual screening), absolute error, and correlation.

DSM-only models cannot directly yield absolute energy values and generally perform worse on ranking metrics, because they assume that observed complexes follow a Boltzmann distribution, an assumption that real protein–ligand data often violate. MSE-only models are more sensitive to the amount of labeled data and are prone to overfitting. DualBind’s hybrid approach nearly matches the full-data MSE model’s performance even when only half the data have labels, provided the remaining unlabeled data are used in the DSM term.

6. Practical Applications

DualBind’s capabilities are directly relevant to several domains:

Drug Discovery: Rapid, accurate binding affinity estimation facilitates virtual screening and prioritization of large chemical libraries. With computation times in the tens of milliseconds per complex (compared to hours for AB-FEP), high-throughput application is feasible.
Toxicity Assessment: Specific validation on ERα-related toxicology (as in ToxBench) enables efficient risk profiling for endocrine disruptors in compound pipelines.
Lead Optimization: The physically regularized energy landscape and gradient availability enable downstream applications in ligand design and molecular docking, supporting optimization tasks.
Resource Efficiency: By integrating unlabeled data, DualBind maximizes information extraction from available structures, alleviating reliance on expensive experimental affinity measurements.
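The gradient availability mentioned under Lead Optimization can be pictured as a pose refinement loop that moves ligand atoms downhill on the energy. The sketch below is only an illustration: `toy_energy` is a hypothetical harmonic contact term rather than a trained $E_\theta$, atom types in `A` are simple 0/1 flags for protein and ligand, gradients are finite differences, and the step size and iteration count are arbitrary.

```python
import math

def toy_energy(A, X, d0=3.0):
    """Hypothetical energy: harmonic penalty on protein-ligand distances around d0."""
    e = 0.0
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            if A[i] != A[j]:  # only protein-ligand pairs contribute
                e += (math.dist(X[i], X[j]) - d0) ** 2
    return e

def ligand_grad(energy, A, X, ligand_idx, h=1e-5):
    """Central-difference gradient restricted to ligand coordinates."""
    X = [row[:] for row in X]
    grad = {}
    for i in ligand_idx:
        g = [0.0, 0.0, 0.0]
        for k in range(3):
            x0 = X[i][k]
            X[i][k] = x0 + h
            e_plus = energy(A, X)
            X[i][k] = x0 - h
            e_minus = energy(A, X)
            X[i][k] = x0
            g[k] = (e_plus - e_minus) / (2.0 * h)
        grad[i] = g
    return grad

def refine_ligand(energy, A, X, ligand_idx, step=0.05, n_steps=100):
    """Plain gradient descent on ligand atoms; protein atoms stay fixed."""
    X = [row[:] for row in X]  # leave the input pose unchanged
    for _ in range(n_steps):
        grad = ligand_grad(energy, A, X, ligand_idx)
        for i in ligand_idx:
            for k in range(3):
                X[i][k] -= step * grad[i][k]
    return X

# Made-up pose: two protein atoms (type 0) and one ligand atom (type 1).
A = [0, 0, 1]
X = [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [2.0, 5.0, 0.0]]
X_refined = refine_ligand(toy_energy, A, X, ligand_idx=[2])
```

With a learned energy the same loop would typically use autograd and a more careful optimizer, and the refined pose would still need chemical validity checks that this toy ignores.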

The dual-loss approach and general architecture are adaptable to other molecular interaction prediction tasks and can be extended or used as a backbone for generative modeling in computational biology.

7. Future Directions and Theoretical Context

DualBind exemplifies an emerging class of models that incorporate both supervision and unsupervised distributional regularization to balance accuracy and generalizability. Its SE(3)-invariant energy formulation aligns with trends in geometric deep learning. A plausible implication is that such frameworks could be extended to model other macromolecular interactions or guide structure-based generative models for molecular design.

Dynamic feature binding, studied in related work on perception and vision, is conceptually analogous to the DualBind loss: both aim to disambiguate compositional inputs into physically or semantically coherent outputs (Greff et al., 2015, Taghipour et al., 27 Feb 2024). The dual-objective principle can inform architectural and methodological choices in broader domains where labeled data are scarce but domain-specific constraints and priors are available.

In summary, DualBind’s design, mathematical grounding, and demonstrated empirical performance provide a robust and extensible foundation for affinity prediction tasks in computational chemistry and allied fields.