Holographic-(V)AE: an end-to-end SO(3)-Equivariant (Variational) Autoencoder in Fourier Space (2209.15567v2)
Abstract: Group-equivariant neural networks have emerged as a data-efficient approach to solve classification and regression tasks, while respecting the relevant symmetries of the data. However, little work has been done to extend this paradigm to the unsupervised and generative domains. Here, we present Holographic-(Variational) Auto Encoder (H-(V)AE), a fully end-to-end SO(3)-equivariant (variational) autoencoder in Fourier space, suitable for unsupervised learning and generation of data distributed around a specified origin in 3D. H-(V)AE is trained to reconstruct the spherical Fourier encoding of data, learning in the process a low-dimensional representation of the data (i.e., a latent space) with a maximally informative rotationally invariant embedding alongside an equivariant frame describing the orientation of the data. We extensively test the performance of H-(V)AE on diverse datasets. We show that the learned latent space efficiently encodes the categorical features of spherical images. Moreover, H-(V)AE's latent space can be used to extract compact embeddings for protein structure microenvironments, and when paired with a Random Forest Regressor, it enables state-of-the-art predictions of protein-ligand binding affinity.
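The split between a rotationally invariant embedding and an equivariant frame can be illustrated with a toy example. This is a minimal sketch and not the H-(V)AE architecture: it assumes a small hand-picked point cloud, a hypothetical Gaussian radial weight, and only the three real degree-1 spherical harmonics, which are proportional to x/r, y/r and z/r. All function names are illustrative. Under these assumptions the degree-1 coefficient vector rotates with the data (equivariance), while its norm stays fixed (invariance).

```python
import math

def l1_coefficients(points):
    """Project a point cloud onto the real l=1 spherical harmonics.

    Up to a constant factor these harmonics are x/r, y/r, z/r.
    The Gaussian radial weight is an assumption made for this sketch.
    """
    coeffs = [0.0, 0.0, 0.0]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        w = math.exp(-r * r)  # depends on r only, so it is unchanged by rotation
        coeffs[0] += w * x / r
        coeffs[1] += w * y / r
        coeffs[2] += w * z / r
    return coeffs

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotate(rot, v):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return [sum(rot[i][k] * v[k] for k in range(3)) for i in range(3)]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

# An arbitrary rotation: first about z by angle a, then about x by angle b.
a, b = 0.7, -1.2
rot_z = [[math.cos(a), -math.sin(a), 0.0],
         [math.sin(a), math.cos(a), 0.0],
         [0.0, 0.0, 1.0]]
rot_x = [[1.0, 0.0, 0.0],
         [0.0, math.cos(b), -math.sin(b)],
         [0.0, math.sin(b), math.cos(b)]]
rot = matmul(rot_x, rot_z)

# Toy data distributed around the origin (no point sits at the origin itself).
points = [(1.0, 0.2, -0.3), (-0.5, 0.8, 0.1), (0.3, -0.4, 0.9)]
rotated_points = [tuple(rotate(rot, p)) for p in points]

coeffs = l1_coefficients(points)
coeffs_rot = l1_coefficients(rotated_points)

# Equivariance: encoding the rotated data equals rotating the encoding.
equivariance_gap = max(abs(u - v)
                       for u, v in zip(coeffs_rot, rotate(rot, coeffs)))
# Invariance: the norm of the l=1 coefficients ignores the rotation.
invariance_gap = abs(norm(coeffs_rot) - norm(coeffs))

print("equivariance gap:", equivariance_gap)
print("invariance gap:", invariance_gap)
```

Both gaps should vanish up to floating-point error. In this real (x, y, z) basis the degree-1 coefficients transform by the rotation matrix itself; higher degrees transform by larger Wigner D-matrices, which this sketch does not cover.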