Structure-based drug design by denoising voxel grids (2405.03961v2)
Abstract: We present VoxBind, a new score-based generative model for 3D molecules conditioned on protein structures. Our approach represents molecules as 3D atomic density grids and leverages a 3D voxel-denoising network for learning and generation. We extend the neural empirical Bayes formalism (Saremi & Hyvärinen, 2019) to the conditional setting and generate structure-conditioned molecules with a two-step procedure: (i) sample noisy molecules from the Gaussian-smoothed conditional distribution with underdamped Langevin MCMC using the learned score function and (ii) estimate clean molecules from the noisy samples with single-step denoising. Compared to the current state of the art, our model is simpler to train, significantly faster to sample from, and achieves better results on extensive in silico benchmarks: the generated molecules are more diverse, exhibit fewer steric clashes, and bind with higher affinity to protein pockets. The code is available at https://github.com/genentech/voxbind/.
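The two-step procedure in the abstract follows the walk-jump pattern of neural empirical Bayes: a Langevin "walk" over noisy samples y, then a single-step "jump" with the empirical Bayes estimator x̂(y) = y + σ² ∇_y log p(y | c) (Miyasawa, 1961). The following is a minimal sketch on a 1D toy problem, not the VoxBind implementation. The analytic Gaussian score stands in for the learned voxel-denoising network, the scalar `cond` stands in for the protein pocket, and the BAOAB-style integrator, step size, friction, and all function names are assumptions chosen for illustration.

```python
import math
import random


def make_toy_score(mu, s2, sigma):
    """Analytic score of the smoothed density for toy data X ~ N(mu + cond, s2).

    With Y = X + N(0, sigma^2), Y ~ N(mu + cond, s2 + sigma^2), so the score
    is known in closed form. A trained network would replace this function.
    """
    var_y = s2 + sigma ** 2

    def score(y, cond):
        # cond shifts the data mean: a hypothetical stand-in for conditioning
        return -(y - (mu + cond)) / var_y

    return score


def walk(score, cond, sigma, n_steps=500, step=0.5, friction=1.0, rng=random):
    """Step (i): underdamped Langevin MCMC on the smoothed conditional density.

    Uses a BAOAB-style splitting (an assumed discretization) with unit mass.
    """
    y = rng.gauss(0.0, sigma)  # start from noise
    v = 0.0                    # auxiliary velocity variable
    decay = math.exp(-friction * step)
    noise_scale = math.sqrt(1.0 - decay ** 2)
    for _ in range(n_steps):
        v += 0.5 * step * score(y, cond)                   # half kick
        y += 0.5 * step * v                                # half drift
        v = decay * v + noise_scale * rng.gauss(0.0, 1.0)  # friction + noise
        y += 0.5 * step * v                                # half drift
        v += 0.5 * step * score(y, cond)                   # half kick
    return y


def jump(score, y, cond, sigma):
    """Step (ii): single-step denoising, x_hat = y + sigma^2 * score(y)."""
    return y + sigma ** 2 * score(y, cond)


if __name__ == "__main__":
    sigma = 1.0  # toy noise level, not a value taken from the paper
    score = make_toy_score(mu=2.0, s2=0.25, sigma=sigma)
    rng = random.Random(0)
    y = walk(score, cond=1.0, sigma=sigma, rng=rng)
    print("noisy sample:", y, "denoised estimate:", jump(score, y, 1.0, sigma))
```

Because the chain only ever moves through the smoothed density, the clean estimate is computed once at the end rather than at every step; that is the part of the sketch that mirrors the procedure described above.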
References:
- Equivariant shape-conditioned generation of 3D molecules for ligand-based drug design. arXiv:2210.04893, 2022.
- SegDiff: Image segmentation with diffusion probabilistic models. arXiv:2112.00390, 2021.
- Anderson, A. C. The process of structure-based drug design. Chemistry & biology, 2003.
- Are transformers more robust than CNNs? NeurIPS, 2021.
- Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? Journal of cheminformatics, 2015.
- MACE: Higher order equivariant message passing neural networks for fast and accurate force fields. NeurIPS, 2022.
- Quantifying the chemical beauty of drugs. Nature chemistry, 2012.
- Blundell, T. L. Structure-based drug design. Nature, 1996.
- AutoDock Vina 1.2.0: New docking methods, expanded force field, and Python bindings. JCIM, 2021.
- Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Networks, 2018.
- Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. Journal of cheminformatics, 2009.
- Virtual exploration of the small-molecule chemical universe below 160 daltons. Angewandte Chemie International Edition, 2005.
- Language models can generate molecules, materials, and protein binding sites directly in three dimensions as xyz, cif, and pdb files. arXiv:2305.05708, 2023.
- Three-dimensional convolutional neural networks and a cross-docked data set for structure-based drug design. Journal of chemical information and modeling, 2020.
- Protein discovery with discrete walk-jump sampling. In ICLR, 2024.
- Generating equilibrium molecules with deep neural networks. arXiv:1810.11347, 2018.
- Symmetry-adapted generation of 3D point sets for the targeted discovery of molecules. In NeurIPS, 2019.
- e3nn: Euclidean neural networks. arXiv:2207.09453, 2022.
- The Lie derivative for measuring learned equivariance. arXiv:2210.02984, 2022.
- 3D equivariant diffusion for target-aware molecule generation and affinity prediction. ICLR, 2023a.
- DecompDiff: Diffusion models with decomposed priors for structure-based drug design. In ICML, 2023b.
- Halgren, T. A. Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94. Journal of computational chemistry, 1996.
- Benchmarking generated poses: How rational is structure-based drug design with generative models? arXiv:2308.0741, 2023.
- Equivariant diffusion for molecule generation in 3D. In ICML, 2022.
- MDM: Molecular diffusion model for 3D molecule generation. arXiv:2209.05710, 2022.
- Hyvärinen, A. Estimation of non-normalized statistical models by score matching. JMLR, 2005.
- Auto-encoding variational Bayes. In ICLR, 2014.
- Equivariant flows: exact likelihood generative learning for symmetric densities. In ICML, 2020.
- Landrum, G. RDKit: Open-source cheminformatics software, 2016. URL https://github.com/rdkit/rdkit/releases/tag/Release_2016_09_4.
- On the modeling of polar component of solvation energy using smooth Gaussian-based dielectric function. Journal of Theoretical and Computational Chemistry, 2014.
- Generating 3D molecules for target protein binding. arXiv, 2022.
- Zero-shot 3D drug design by sketching and generating. NeurIPS, 2022.
- Decoupled weight decay regularization. In ICLR, 2019.
- RePaint: Inpainting using denoising diffusion probabilistic models. In CVPR, 2022.
- A 3D generative model for structure-based drug design. NeurIPS, 2021.
- An autoregressive flow model for 3D molecular geometry generation from scratch. In ICLR, 2022.
- Ultra-large library docking for discovering new chemotypes. Nature, 2019.
- Miyasawa, K. An empirical Bayes estimator of the mean of a normal population. Bull. Inst. Internat. Statistics, 1961.
- Weisfeiler and Leman go neural: Higher-order graph neural networks. In AAAI, 2019.
- Open Babel: An open chemical toolbox. Journal of cheminformatics, 2011.
- PyUUL provides an interface between biological structures and deep learning algorithms. Nature communications, 2022.
- Pocket2Mol: Efficient molecular sampling based on 3D protein pockets. In ICML, 2022.
- 3D molecule generation by denoising voxel grids. In NeurIPS, 2023.
- Geometric deep learning for structure-based ligand design. ACS Central Science, 2023.
- Incompleteness of graph convolutional neural networks for points clouds in three dimensions. arXiv:2201.07136, 2022.
- Learning a continuous representation of 3D molecular structures with deep generative models. In NeurIPS, Structural Biology workshop, 2020.
- Generating 3D molecules conditional on receptor binding sites with deep generative models. Chemical science, 2022.
- UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations. Journal of the American chemical society, 1992.
- Variational inference with normalizing flows. In ICML, 2015.
- Robbins, H. E. An empirical Bayes approach to statistics. In Proc. 3rd Berkeley Symp. Math. Statist. Probab., 1956.
- High-resolution image synthesis with latent diffusion models. In CVPR, 2022.
- U-Net: Convolutional networks for biomedical image segmentation. In MICCAI, 2015.
- AlphaSpace: Fragment-centric topographical mapping to target protein–protein interaction interfaces. Journal of chemical information and modeling, 2015.
- Structure-based drug design via semi-equivariant conditional normalizing flows. In ICLR, Machine Learning for Drug Discovery workshop, 2023.
- Langevin dynamics with variable coefficients and nonconservative forces: from stationary states to numerical methods. Entropy, 2017.
- Palette: Image-to-image diffusion models. In SIGGRAPH, 2022a.
- Photorealistic text-to-image diffusion models with deep language understanding. In NeurIPS, 2022b.
- Image super-resolution via iterative refinement. PAMI, 2022c.
- Neural empirical Bayes. JMLR, 2019.
- Universal smoothed score functions for generative modeling. arXiv:2303.11669, 2023.
- E(n) equivariant graph neural networks. In ICML, 2021.
- The surprising effectiveness of diffusion models for optical flow and monocular depth estimation. arXiv:2306.01923, 2023.
- Structure-based drug design with equivariant diffusion models. arXiv:2210.13695, 2022.
- Shape-based generative modeling for de novo drug design. Journal of chemical information and modeling, 2019.
- Deep unsupervised learning using nonequilibrium thermodynamics. In ICML, 2015.
- MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nature biotechnology, 2017.
- Integrating structure-based approaches in generative molecular design. Current Opinion in Structural Biology, 2023.
- Atom3D: Tasks on molecules in three dimensions. NeurIPS, 2020.
- MiDi: Mixed graph and 3D denoising diffusion for molecule generation. ICLR, MLDD workshop, 2023.
- A pocket-based 3D molecule generative model fueled by experimental electron density. Scientific reports, 2022a.
- RELATION: A deep generative model for structure-based de novo drug design. Journal of Medicinal Chemistry, 2022b.
- Generating molecular conformer fields. arXiv:2311.17932, 2023.
- 3D steerable CNNs: Learning rotationally equivariant features in volumetric data. NeurIPS, 2018.
- How powerful are graph neural networks? ICLR, 2019.
- Geometric latent diffusion models for 3D molecule generation. In ICML, 2023.
- Molecule generation for target protein binding with structural motifs. In ICLR, 2023.