- The paper demonstrates that AlphaFold 3 achieves high spatial accuracy in CD47 complexes (RMSD < 3 Å in most cases; 0.79 Å for 5TZ2) using advanced deep learning and reduced MSA dependence.
- The paper reveals an unexpected reverse docking fault, with misdocked predictions (53/81 for D2510) occurring despite high ipTM/pTM confidence scores.
- The paper highlights that while local affinity predictions via MM/GBSA are promising, challenges in global ranking suggest a need for improved MSA sampling and integration of physics-based restraints.
AlphaFold 3 Performance and Reverse Docking in CD47 Antibody-Antigen Affinity Prediction
Introduction
AlphaFold 3 (AF3) advances structural bioinformatics by extending deep learning-based protein structure prediction to general biomolecular complexes, incorporating proteins, nucleic acids, ions, and small molecules. Built on the Pairformer trunk (the successor to AlphaFold 2's Evoformer) and a diffusion module, AF3 exhibits reduced multiple sequence alignment (MSA) reliance and increased computational efficiency. This study evaluates AF3’s ability to predict binding affinities and structures of CD47 antibody-antigen complexes, with particular attention to the unexpected emergence of a structural fault mode termed "reverse docking."
Methodological Framework
CD47-targeting antibody sequences were obtained from the Protein Data Bank (PDB) and proprietary sources. The extracellular domain (ECD) of CD47 served as the antigen. Homology models were constructed via Swiss-Model and Discovery Studio. AF3, AFM (AlphaFold-Multimer), and commercial docking algorithms (HDOCK, ZDOCK, PIPER) predicted complex structures. MM/GBSA, implemented in Schrödinger's Prime, quantified relative binding free energy (RBFE) as an affinity metric. Structural accuracy was benchmarked by root-mean-square deviation (RMSD) against X-ray crystallographic references, and model confidence was assessed with ipTM and pTM scores, which estimate interface and global accuracy, respectively. Spearman’s rank correlation and a weighted scoring system assessed affinity prediction reliability.
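For orientation, the generic end-point form of an MM/GBSA binding energy is written out here. This is the textbook expression, not a transcription of the Prime settings used in the study; the subscript names are illustrative, and the omission of an explicit entropy term is an assumption that reflects common practice.

```latex
\Delta G_{\text{bind}} = G_{\text{complex}} - \left( G_{\text{antibody}} + G_{\text{antigen}} \right),
\qquad
G = E_{\text{MM}} + G_{\text{GB}} + G_{\text{SA}}
```

Here $E_{\text{MM}}$ is the molecular-mechanics energy, $G_{\text{GB}}$ the generalized Born polar solvation term, and $G_{\text{SA}}$ the nonpolar surface-area term. Because the entropy contribution is usually left out, values of this kind are more meaningful for comparing related complexes than as absolute affinities.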
Evaluation of AF3 Structure Prediction Fidelity
Five CD47 antibody-antigen complexes with high-resolution PDB structures were evaluated. AF3 achieved RMSD < 3 Å in most cases, confirming credible spatial accuracy, with markedly high performance for 5TZ2 (0.79 Å RMSD; ipTM ≈ 0.89, pTM ≈ 0.90). AF3 reliably recapitulated key contact residues in the antigen-binding interface for C47B222 (5TZ2 antibody), including ASP-55, ARG-59, and HIS-105. For the diabody 5F9, prediction was less accurate (RMSD = 9.08 Å), highlighting persistent limitations for large or multimeric antibodies.
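The RMSD values quoted above depend on how predicted and reference structures are superimposed. The following is a minimal sketch of one common approach (Kabsch superposition followed by RMSD), assuming matched (N, 3) coordinate arrays such as Cα atoms. The function name `kabsch_rmsd` and the choice of atoms are illustrative assumptions; the study's exact alignment protocol is not stated here.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two matched (N, 3) coordinate sets after optimal superposition.

    Assumption: rows of P and Q correspond to the same atoms (e.g. C-alpha).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove translation by centering both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Optimal rotation from the SVD of the covariance matrix.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the residual deviation.
    P_rot = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))

# Toy check: a rotated and shifted copy of the same points should give ~0 Å.
rng = np.random.default_rng(0)
ref = rng.normal(size=(20, 3))
theta = np.deg2rad(30.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
moved = ref @ Rz.T + np.array([5.0, -2.0, 1.0])
print(kabsch_rmsd(ref, moved))
```

Whether RMSD is computed over the whole complex or only over interface residues changes the number considerably, so any comparison against the values above should use the same atom selection throughout.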
Comparative Affinity Prediction: RBFE and Ranking
MM/GBSA-derived RBFE values from AF3 were closest to experimental benchmarks (ΔRBFE = 11.21 kJ/mol), outperforming AFM and commercial docking alternatives. In affinity ranking of four antibodies (D0604, 5TZ2, D2510, 5F9), AF3, ZDOCK, and PIPER each produced a ranking, but none reproduced the experimental order; the Spearman correlation coefficient for AF3 was 0.0. Despite this, pairwise affinity relationships were predicted more accurately (AF3 local accuracy score: 470), suggesting reliable discrimination in pairwise comparisons but weak global ranking.
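To make the distinction between global ranking and pairwise comparison concrete, the sketch below computes Spearman's rank correlation alongside a simple fraction of correctly ordered pairs. The numbers are invented for illustration and are not the study's data. The `pairwise_agreement` function is a hypothetical stand-in: the weighted scoring system behind the local accuracy score of 470 is not described here and is not reproduced.

```python
from itertools import combinations

def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(x, y):
    """Spearman's rank correlation for tie-free data."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def pairwise_agreement(x, y):
    """Fraction of item pairs ordered the same way in x and y (hypothetical metric)."""
    pairs = list(combinations(range(len(x)), 2))
    agree = sum(1 for i, j in pairs if (x[i] - x[j]) * (y[i] - y[j]) > 0)
    return agree / len(pairs)

# Invented affinities for four candidates (not taken from the paper).
experimental = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 1.0, 3.0]

print(spearman_rho(experimental, predicted))        # 0.0
print(pairwise_agreement(experimental, predicted))  # 0.5
```

With only four items, a rank correlation of zero can coexist with half of all pairs being ordered correctly, which is one way a method can look poor on global ranking yet remain useful for head-to-head comparisons.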
AF3’s architectural innovations (minimal MSA blocks, diffusion-based generation) enabled handling of multiple chain inputs and reduced computational time (typical single prediction ~5 minutes), which is advantageous for high-throughput protein engineering and antibody screening.
Discovery and Analysis of Reverse Docking Fault Mode
Reverse docking, in which AF3 places the antibody-antigen interface in an incorrect, reversed orientation, was notably frequent for the D2510 and D2523 antibodies (53/81 predictions for D2510). Statistical analysis (t-test, p ≪ 1e-32) indicated that this frequency was not due to chance. These misdocked models nevertheless retained high ipTM and pTM confidence scores, revealing a disconnect between model confidence and interface validity.
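Because confidence scores alone did not expose these models, a geometric sanity check can complement them. The sketch below is one possible heuristic, not the procedure used in the study: it measures how many residues expected at the interface (for example, CDR positions on the antibody or a known epitope on CD47) actually lie near the partner chain, and flags models that score low on this measure despite a high ipTM. All function names, the 8 Å cutoff, and both thresholds are assumptions chosen for illustration.

```python
import numpy as np

def interface_indices(coords_a, coords_b, cutoff=8.0):
    """Indices of residues in A with any residue of B within `cutoff` angstroms."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return set(np.flatnonzero((dists < cutoff).any(axis=1)).tolist())

def expected_interface_recall(coords_a, coords_b, expected_a, cutoff=8.0):
    """Fraction of expected interface residues of A that contact B."""
    expected = set(expected_a)
    if not expected:
        raise ValueError("expected_a must not be empty")
    found = interface_indices(coords_a, coords_b, cutoff)
    return len(found & expected) / len(expected)

def flag_suspect(recall, iptm, min_recall=0.5, high_iptm=0.8):
    """True when a confident model misses most of the expected interface."""
    return recall < min_recall and iptm >= high_iptm

# Toy geometry: ten "residues" spaced 4 angstroms apart along the x axis.
antibody = np.array([[4.0 * i, 0.0, 0.0] for i in range(10)])
expected = [0, 1, 2]  # hypothetical paratope positions

antigen_ok = np.array([[0.0, 5.0, 0.0], [4.0, 5.0, 0.0]])          # near the paratope
antigen_reversed = np.array([[32.0, 5.0, 0.0], [36.0, 5.0, 0.0]])  # at the opposite end

recall_ok = expected_interface_recall(antibody, antigen_ok, expected)
recall_rev = expected_interface_recall(antibody, antigen_reversed, expected)
print(recall_ok, flag_suspect(recall_ok, iptm=0.85))
print(recall_rev, flag_suspect(recall_rev, iptm=0.85))
```

A check of this kind needs prior knowledge of where binding should occur, so it is best suited to antibodies with annotated CDRs or antigens with a characterized epitope.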
The architectural mechanism underlying reverse docking is linked to AF3’s minimized MSA dependence. Without sufficient sequence co-evolutionary information from MSA, AF3’s confidence/denoising modules potentially favor template- or geometry-based assembly, misassigning interaction directionality, especially in families lacking high-quality reference structures.
Attempts to correct reverse docking by extending antigen sequences (i.e., including CD47 transmembrane regions) normalized docking for D2510 but not for D2523, accompanied by substantially lower interface scores, indicating limited generalizability and reduced prediction confidence for full-length antigens with unresolved domains.
Implications, Limitations, and Future Directions
The emergence of reverse docking highlights a critical challenge in data-driven modeling: reliance on confidence scores that are not aligned with biochemical reality, especially when template information supersedes evolutionary context. Ignoring unresolved regions or post-translational modifications in antigens, a limitation of this study, may have confounded some predictions. The findings advocate for optimizing AF3’s MSA input, potentially via improved sampling or automated selection of homologous templates. Allowing users to specify interface regions interactively, as HDOCK does, may mitigate reverse docking.
AF3 was found to be robust for RNA-protein, peptide-protein, and small molecule-protein complexes; nevertheless, for highly specialized targets such as antibody-antigen interactions, future retraining or model augmentation is warranted. Integration of physics-based restraints or hybrid docking strategies may further regulate generative architectures, constraining hallucination phenomena and improving global ranking outcomes.
Rapid affinity screening using AF3 enables prioritization of antibody candidates for experimental validation and rational mutation via in silico affinity selection, potentially supplanting traditional hybridoma and phage display screens. However, comprehensive validation across other complex classes is needed to generalize model reliability.
Conclusion
AF3 substantially improves the prediction of antibody-antigen complex structures and relative binding affinities compared to earlier machine learning and physics-based docking methods. It demonstrates high spatial accuracy and reliable contact mapping but is susceptible to erroneous interface assignments in the absence of high-quality MSA data, as exemplified by the reverse docking phenomenon. The model’s computational efficiency and flexibility hold promise for accelerating antibody engineering and drug discovery pipelines, albeit with caveats regarding global affinity ranking and complex fault modalities. Future work should focus on resolving reverse docking through enhanced evolutionary context, user-guided input, and integration of physics-based constraints to fully realize AF3’s potential in structural immunology and beyond.