AntiFold: Improved antibody structure-based design using inverse folding (2405.03370v1)
Abstract: The design and optimization of antibodies require an intricate balance across multiple properties. Protein inverse folding models, capable of generating diverse sequences folding into the same structure, are promising tools for maintaining structural integrity during antibody design. Here, we present AntiFold, an antibody-specific inverse folding model, fine-tuned from ESM-IF1 on solved and predicted antibody structures. AntiFold outperforms existing inverse folding tools on sequence recovery across complementarity-determining regions, with designed sequences showing high structural similarity to their solved counterparts. It additionally achieves stronger correlations when predicting antibody-antigen binding affinity in a zero-shot manner, and performance improves further when antigen information is included. AntiFold assigns low probabilities to mutations that disrupt antigen binding, synergizing with protein LLM residue probabilities, and demonstrates promise for guiding antibody optimization while retaining structure-related properties. AntiFold is freely available under the BSD 3-Clause license as a web server at https://opig.stats.ox.ac.uk/webapps/antifold/ and as a pip-installable package at https://github.com/oxpig/AntiFold
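As a rough illustration of the sequence recovery metric mentioned in the abstract, the following is a minimal sketch that computes the fraction of positions at which a designed sequence matches the native one, optionally restricted to a subset of positions such as a CDR. The function name `sequence_recovery`, the toy sequences, and the position indices are assumptions made for illustration; they are not taken from the AntiFold package or the paper's evaluation code.

```python
from typing import Iterable, Optional


def sequence_recovery(
    native: str,
    designed: str,
    positions: Optional[Iterable[int]] = None,
) -> float:
    """Fraction of scored positions where designed and native residues match.

    Both sequences are assumed to be pre-aligned to the same length.
    `positions` holds 0-based indices to score (e.g. a CDR); all positions
    are scored when it is None.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must be aligned to the same length")
    idx = list(range(len(native))) if positions is None else list(positions)
    if not idx:
        raise ValueError("no positions to score")
    matches = sum(native[i] == designed[i] for i in idx)
    return matches / len(idx)


# Hypothetical toy sequences and region indices, for illustration only.
native = "ARDYYGSSYFDY"
designed = "ARDYWGSGYFDV"
cdr_positions = range(2, 10)  # assumed 0-based indices of a CDR-like region

overall = sequence_recovery(native, designed)
cdr_only = sequence_recovery(native, designed, cdr_positions)
print(f"overall recovery: {overall:.2f}, region recovery: {cdr_only:.2f}")
```

In practice the scored positions would come from a numbering scheme that delimits the CDRs, and recovery would be averaged over many sampled sequences per structure.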