EnhancedMPNN: Protein Design Optimization
- EnhancedMPNN is a protein design model that optimizes sequence designability by fine-tuning LigandMPNN with residue-level objectives based on AlphaFold2 pLDDT scores.
- It employs two phases—Direct Preference Optimization (DPO) and Residue-level Designability Preference Optimization (ResiDPO)—to improve design success for enzymes and binders.
- EnhancedMPNN achieves a roughly 2.7-fold increase in enzyme-design sequence success rate (6.56% to 17.57%) through residue-wise updates that raise predicted structural confidence while preserving sequence recovery.
EnhancedMPNN is a protein sequence design model obtained via residue-level preference optimization atop LigandMPNN, a ligand-aware extension of the ProteinMPNN architecture. EnhancedMPNN directly optimizes for designability—the likelihood that a generated sequence folds to its target structure—using novel fine-tuning objectives based on AlphaFold2 pLDDT scores. This approach departs from the traditional focus on sequence recovery and achieves a substantial enhancement in in silico design success, particularly for enzyme and binder scaffolding tasks (Xue et al., 30 May 2025).
1. Architecture and Foundations
EnhancedMPNN maintains the architecture of LigandMPNN, itself an extension of ProteinMPNN. The model processes fixed three-dimensional protein backbones, representing residues as nodes in a geometric graph with edges encoding spatial and orientational relationships. LigandMPNN extends this formalism by additionally representing ligand atoms and their coordinates within the graph, enabling the joint modeling of side-chain packing and ligand interactions.
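As a rough illustration of the graph representation described above, the sketch below links each residue to its nearest neighbours by Cα distance. The function name `knn_edges`, the neighbour count `k`, and the toy coordinates are assumptions made for this example; the orientational edge features and ligand-atom nodes used by the actual models are omitted.

```python
import math

def knn_edges(coords, k):
    """Return directed edges (i, j, distance) from each node to its k nearest neighbours."""
    edges = []
    for i, ci in enumerate(coords):
        # Distance from residue i to every other residue
        dists = [(math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i]
        dists.sort()
        for d, j in dists[:k]:
            edges.append((i, j, d))
    return edges

# Toy backbone (hypothetical): four C-alpha atoms on a line, 3.8 Å apart
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
edges = knn_edges(ca, k=2)
```

Each edge carries only a distance here; a fuller featurization would add relative orientation between residue frames and, for a ligand-aware model, extra nodes for ligand atoms.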
No structural changes are made to the neural network in EnhancedMPNN; instead, optimization occurs entirely at the level of parameter fine-tuning. The core methodological innovation lies in preference-based alignment to structural accuracy, leveraging external structure prediction signals.
2. Preference Optimization: DPO and ResiDPO
EnhancedMPNN is trained in two preference-alignment phases: Direct Preference Optimization (DPO) and Residue-level Designability Preference Optimization (ResiDPO).
2.1 Direct Preference Optimization (DPO)
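DPO employs a dataset $\mathcal{D} = \{(x, y_w, y_l)\}$, where $x$ represents the backbone context, $y_w$ a sequence with higher predicted pLDDT, and $y_l$ a sequence with lower pLDDT. The DPO objective fine-tunes a policy $\pi_\theta$, initialized from LigandMPNN ($\pi_{\text{ref}}$), to prefer sequences with higher pLDDT, using the following loss:

$$\mathcal{L}_{\text{DPO}} = -\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}} \left[ \log \sigma\!\left( \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)} \right) \right]$$

where $\sigma$ is the sigmoid and $\beta$ regulates preference strength versus regularization. AlphaFold2 pLDDT scores serve as the preference indicator, and pairs are selected when the pLDDT of $y_w$ exceeds that of $y_l$ by a minimum margin.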
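The DPO objective for a single preference pair can be sketched in plain Python. This is a minimal sketch, assuming each argument is the total log-probability of a sequence (the sum of its per-residue log-probabilities) under the fine-tuned policy or the frozen LigandMPNN reference. The function name and the default `beta=0.1` are illustrative assumptions, not values from the source.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (winner, loser) pair of sequence log-probabilities."""
    # Scaled difference of policy-vs-reference log-ratios for winner and loser
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)) = log(1 + exp(-margin)), computed in a numerically stable form
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

# Policy identical to the reference: the margin is zero and the loss equals log(2)
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# Policy that has shifted probability toward the higher-pLDDT sequence: lower loss
improved = dpo_loss(-9.0, -13.0, -10.0, -12.0)
```

The loss starts at log 2 when the policy equals the reference and falls as the policy raises the winner's likelihood relative to the loser's; `beta` controls how sharply that relative shift is rewarded.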
2.2 Residue-level Designability Preference Optimization (ResiDPO)
ResiDPO extends preference optimization to the residue level, decoupling updates to focus on positions with maximal gain and regularizing high-confidence regions. The loss decomposes into two residue-level terms:
- Residue-level Preference Learning (RPL): Targets the residue indices where strong local gains occur.
- Residue-level Constraint Learning (RCL): Regularizes high-confidence positions to preserve their assignments.
If the RPL index set is empty, the full sequence is selected for RPL. All hyperparameters are set via grid search and ablation.
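One way to picture the split into RPL and RCL positions is the sketch below. The selection rules are assumptions made for illustration: a residue joins the RPL set when the preferred sequence gains more than `gain_cutoff` pLDDT points over the dispreferred one, and joins the RCL set when both sequences are already at or above `conf_cutoff`. Only the fallback to the full sequence when the RPL set is empty comes from the description above; both cutoffs are hypothetical parameters.

```python
def select_residue_sets(plddt_w, plddt_l, gain_cutoff, conf_cutoff):
    """Split residue indices into a preference set (RPL) and a constraint set (RCL)."""
    n = len(plddt_w)
    # Assumed rule: strong local gain of the preferred over the dispreferred sequence
    rpl = [i for i in range(n) if plddt_w[i] - plddt_l[i] > gain_cutoff]
    if not rpl:
        # Fallback stated in the text: use the full sequence for RPL
        rpl = list(range(n))
    chosen = set(rpl)
    # Assumed rule: constrain positions that are confident in both sequences
    rcl = [i for i in range(n)
           if i not in chosen and min(plddt_w[i], plddt_l[i]) >= conf_cutoff]
    return rpl, rcl

# Toy per-residue pLDDT values (hypothetical)
rpl, rcl = select_residue_sets([90, 85, 70, 92], [88, 60, 65, 91],
                               gain_cutoff=10, conf_cutoff=80)
```

In this toy case only residue 1 shows a large gain, so it alone drives preference learning, while residues 0 and 3 are held near their pretrained assignments.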
3. Training Protocol
Dataset Construction
Training utilizes a curated, monomeric subset of the PDB (PDB-D), filtered for X-ray resolution of 3.5 Å or better and chains no longer than 1000 residues. AlphaFold2 is run on each wild-type chain to generate per-residue pLDDT labels. The data are split by release date and clustered to prevent data leakage, yielding approximately 19,203 training and 1,690 validation backbones.
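The filtering and date-based split can be sketched as follows. This is an illustrative sketch only: the `Chain` record, the inclusive bounds, the cutoff date, and the toy entries are assumptions, and the clustering step that guards against leakage is omitted.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Chain:
    chain_id: str      # hypothetical identifier
    resolution: float  # X-ray resolution in Å
    length: int        # number of residues
    release: date      # PDB release date

def passes_filter(chain, max_resolution=3.5, max_length=1000):
    """Keep chains within the resolution and length limits (inclusive bounds assumed)."""
    return chain.resolution <= max_resolution and chain.length <= max_length

def split_by_date(chains, cutoff):
    """Chains released before the cutoff go to training, the rest to validation."""
    kept = [c for c in chains if passes_filter(c)]
    train = [c for c in kept if c.release < cutoff]
    val = [c for c in kept if c.release >= cutoff]
    return train, val

toy = [
    Chain("toyA", 2.1, 250, date(2015, 3, 1)),
    Chain("toyB", 4.0, 300, date(2016, 7, 1)),   # resolution too coarse
    Chain("toyC", 1.8, 1200, date(2018, 1, 1)),  # chain too long
    Chain("toyD", 3.0, 400, date(2022, 5, 1)),
]
train, val = split_by_date(toy, cutoff=date(2021, 1, 1))
```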
LigandMPNN samples multiple (typically eight) sequence candidates per backbone, which are then scored by AlphaFold2 for global and residue-level pLDDT. Preference pairs are generated via relative sampling, enforcing a minimum pLDDT difference between the two sequences of a pair, resulting in approximately 9,557 training pairs.
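A simple way to turn scored candidates into preference pairs is sketched below. It enumerates every pair of candidates for one backbone and keeps those whose global pLDDT values differ by at least `min_gap`; the source describes a sampling scheme, so exhaustive enumeration is a simplification, and `min_gap` and the toy sequences are hypothetical.

```python
from itertools import combinations

def build_pairs(candidates, min_gap):
    """candidates: list of (sequence, global_plddt). Returns (winner, loser) sequence pairs."""
    pairs = []
    for (seq_a, p_a), (seq_b, p_b) in combinations(candidates, 2):
        # Keep only pairs separated by a clear confidence margin
        if p_a != p_b and abs(p_a - p_b) >= min_gap:
            winner, loser = (seq_a, seq_b) if p_a > p_b else (seq_b, seq_a)
            pairs.append((winner, loser))
    return pairs

# Three toy candidates for one backbone (hypothetical sequences and scores)
scored = [("AAA", 85.0), ("AAC", 70.0), ("ACC", 82.0)]
pairs = build_pairs(scored, min_gap=10.0)
```

Here the 85 versus 82 pair is discarded because the difference is too small to be a reliable preference signal, while both pairs involving the 70-pLDDT candidate are kept.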
Optimization and Implementation
Training uses mini-batch optimization with the Adam optimizer (cosine learning-rate decay, batch size 8, gradient accumulation 16, 9k iterations) on two NVIDIA L40 GPUs. Fine-tuning proceeds first with DPO, followed by ResiDPO.
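To make the schedule concrete, the sketch below computes a cosine-decayed learning rate and the number of samples contributing to each optimizer step. The base learning rate is left as a parameter, `min_lr` is an assumption, and whether the batch size of 8 is per GPU or global is not specified here, so the effective batch size is computed without a device factor.

```python
import math

def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine decay from base_lr at step 0 to min_lr at total_steps."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

TOTAL_STEPS = 9000   # "9k iterations" from the text
BATCH_SIZE = 8
GRAD_ACCUM = 16

# Samples contributing to each optimizer update, before any multi-GPU factor
effective_batch = BATCH_SIZE * GRAD_ACCUM  # 8 * 16 = 128

# Relative schedule (base_lr = 1.0) at the start, midpoint, and end of training
schedule = [cosine_lr(s, TOTAL_STEPS, base_lr=1.0) for s in (0, 4500, 9000)]
```

With `base_lr=1.0` the schedule gives the multiplier applied to whatever initial learning rate is chosen: full rate at the start, half at the midpoint, and zero at the end.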
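One ResiDPO-style loss evaluation on a single preference pair might look like the following sketch. It is not the authors' pseudocode. The preference term applies the DPO form to log-ratios summed over the RPL positions only, and the constraint term is a stand-in chosen for this example (a mean squared drift of the policy from the reference on the RCL positions). The names `residpo_pair_loss`, `beta`, and `lam` are hypothetical.

```python
import math

def neg_log_sigmoid(x):
    """Numerically stable -log(sigmoid(x))."""
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))

def residpo_pair_loss(lp_w, lp_l, ref_w, ref_l, rpl_idx, rcl_idx, beta, lam):
    """lp_* and ref_* are per-residue log-probs under the policy and the frozen reference."""
    # Preference term: DPO margin restricted to the selected residues
    margin = sum((lp_w[i] - ref_w[i]) - (lp_l[i] - ref_l[i]) for i in rpl_idx)
    rpl_term = neg_log_sigmoid(beta * margin)
    # Constraint term (stand-in, assumed): penalize drift from the reference
    drift = sum((lp_w[i] - ref_w[i]) ** 2 for i in rcl_idx)
    rcl_term = drift / max(len(rcl_idx), 1)
    return rpl_term + lam * rcl_term

# Toy per-residue log-probabilities for a three-residue pair (hypothetical values)
ref_w, ref_l = [-1.0, -2.0, -1.5], [-1.0, -2.5, -1.5]
start = residpo_pair_loss(ref_w, ref_l, ref_w, ref_l, [1], [0, 2], beta=0.1, lam=1.0)
moved = residpo_pair_loss([-1.0, -1.0, -1.5], ref_l, ref_w, ref_l, [1], [0, 2], beta=0.1, lam=1.0)
```

In a training loop this value would be averaged over a mini-batch and backpropagated through the policy's log-probabilities, with the reference model kept frozen.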
4. Evaluation and Empirical Results
EnhancedMPNN is evaluated on enzyme and protein-protein binder benchmarks using the following metrics:
- In silico design success rate: Fraction of designed sequences with pLDDT above a confidence cutoff and Cα-RMSD under 1.5 Å (enzymes), or pLDDT above a confidence cutoff, inter-chain PAE under 10, and RMSD under 1 Å (binders).
- Backbone success rate: Fraction of backbones with at least one successful sequence.
- pLDDT Accuracy: Correlation between model likelihood and actual pLDDT, primarily for ablation studies.
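The first two metrics can be computed as in the sketch below. The pLDDT cutoff is passed in as a parameter rather than fixed, the strict inequalities are an assumption, and the backbone names and numbers are toy values.

```python
def enzyme_success(plddt, ca_rmsd, plddt_cutoff, rmsd_cutoff=1.5):
    """A design counts as successful if it is confident and close to the target backbone."""
    return plddt > plddt_cutoff and ca_rmsd < rmsd_cutoff

def success_rates(results):
    """results maps backbone id -> list of per-sequence success flags."""
    flags = [flag for seqs in results.values() for flag in seqs]
    sequence_rate = sum(flags) / len(flags)
    # A backbone succeeds if at least one of its designed sequences succeeds
    backbone_rate = sum(any(seqs) for seqs in results.values()) / len(results)
    return sequence_rate, backbone_rate

# Toy evaluation: (pLDDT, C-alpha RMSD in Å) for four designs on each of two backbones
raw = {
    "bb1": [(91.0, 0.9), (85.0, 2.1), (70.0, 1.0), (88.0, 1.7)],
    "bb2": [(60.0, 3.0), (75.0, 1.2), (79.0, 1.4), (65.0, 2.5)],
}
flags = {bb: [enzyme_success(p, r, plddt_cutoff=80.0) for p, r in designs]
         for bb, designs in raw.items()}
seq_rate, bb_rate = success_rates(flags)
```

Backbone success is always at least as high as sequence success, since one good sequence out of many is enough for a backbone to count.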
Quantitative Results
| Model | Sequence Success (%) | Backbone Success (%) |
|---|---|---|
| LigandMPNN | 6.56 | 19.74 |
| DPO Fine-tuned | ~9 | — |
| EnhancedMPNN (ResiDPO) | 17.57 | 40.34 |
- EnhancedMPNN achieves a roughly 2.7-fold increase in sequence success rate (6.56% to 17.57%) and a slightly more than 2-fold increase in backbone success rate (19.74% to 40.34%) over LigandMPNN on challenging enzyme design tasks.
- On binder benchmarks, EnhancedMPNN improves sequence success from 7.07% (LigandMPNN) and 10.40% (DPO) to 16.07%.
- Statistical tests on per-residue pLDDT distributions confirm that ResiDPO’s gains are highly significant.
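The fold improvements follow directly from the reported percentages, as this short calculation shows. Only the helper name `fold_change` is introduced here; all figures are taken from the table and bullets above.

```python
def fold_change(new, old):
    """Ratio of a new success rate to a baseline success rate."""
    return new / old

enzyme_sequence = fold_change(17.57, 6.56)    # about 2.68
enzyme_backbone = fold_change(40.34, 19.74)   # about 2.04
binder_vs_base = fold_change(16.07, 7.07)     # about 2.27
binder_vs_dpo = fold_change(16.07, 10.40)     # about 1.55
```

The enzyme sequence-level gain is therefore closer to 2.7-fold than a full 3-fold, and the binder gain over LigandMPNN is a little above 2-fold.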
Ablation experiments confirm the distinct contributions of RPL (local improvement, but with reduced sequence recovery) and RCL (recovery-preserving regularization), with full ResiDPO achieving the highest overall designability (pLDDT Acc = 66.08%) while preserving sequence recovery (55.56%).
5. Mechanistic Insights
ResiDPO enhances designability by decomposing the objective residue by residue: it pinpoints regions of low local structure confidence and selectively raises the assignment probabilities of amino acids that pLDDT indicates are more favorable. High-confidence regions receive only a regularizing constraint, which preserves the pretrained model’s prior knowledge and prevents catastrophic forgetting.
This targeted preference learning resolves ambiguous or conflicting gradients that hinder full-sequence optimization, yielding both more stable and efficient alignment toward higher designability.
6. Limitations and Prospective Directions
Several limitations and open directions are recognized:
- Dependence on AlphaFold2 pLDDT: While a strong correlate of folding accuracy, pLDDT is not an explicit marker of biochemical function such as binding affinity or catalytic efficacy.
- Computational cost: The creation of preference datasets requires large-scale AlphaFold2 inference, limiting throughput. Future work may explore alternate or surrogate structure-quality predictors.
- Generalizability: The ResiDPO framework can be extended to additional objectives (solvent accessibility, Rosetta metrics, stability, immunogenicity) by adapting the reward and residue masking schemes.
- Experimental validation: As of the current study, all results are in silico, and the folding and function of EnhancedMPNN-designed sequences in vitro/in vivo remain to be confirmed.
7. Significance and Outlook
By integrating residue-level structural rewards into the optimization of a state-of-the-art sequence design network, EnhancedMPNN marks a methodological advance for computational protein engineering. The substantial improvement in the predicted folding accuracy of designed sequences demonstrates that fine-tuning with designability-centric objectives can directly bridge the gap between generation and functional adoption of novel protein folds (Xue et al., 30 May 2025). This approach provides a template for incorporating structural evaluation signals into generative modeling for biomolecular systems and suggests that further gains are possible by expanding to diverse reward signals and tasks.