Geometrically Aligned Transfer Encoder for Inductive Transfer in Regression Tasks (2310.06369v1)
Abstract: Transfer learning is a crucial technique for handling small datasets that are potentially related to other, more abundant data. However, most existing methods focus on classification tasks over image and language datasets. To extend transfer learning to regression tasks, we propose a novel transfer technique based on differential geometry, namely the Geometrically Aligned Transfer Encoder (GATE). In this method, the latent vectors produced by the model are interpreted as points on a curved Riemannian manifold. We find a suitable diffeomorphism between pairs of tasks so that every point in the overlapping region maps to locally flat coordinates, allowing knowledge to transfer from the source data to the target data. This alignment also serves as an effective regularizer for the model's behavior in extrapolation regions. In this article, we demonstrate that GATE outperforms conventional methods and exhibits stable behavior in both the latent space and extrapolation regions across various molecular graph datasets.
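The abstract describes aligning the latent spaces of paired tasks so that their local geometry agrees on shared inputs. As a rough illustration only, and not the paper's implementation, the sketch below matches pairwise distances between the latent codes that two task-specific encoders assign to the same molecules; the function names, the distance-matching form of the penalty, and the weight `lambda_align` are assumptions made purely for illustration.

```python
import torch

def pairwise_dist(z: torch.Tensor) -> torch.Tensor:
    # Euclidean pairwise distances between latent vectors (rows of z).
    return torch.cdist(z, z, p=2)

def align_loss(z_src: torch.Tensor, z_tgt: torch.Tensor) -> torch.Tensor:
    # z_src, z_tgt: latent codes of the same inputs from the source-task
    # and target-task encoders, shape (batch, latent_dim).
    # Matching pairwise distances is a simple flat-space stand-in for the
    # paper's mapping to locally flat coordinates; it only illustrates the
    # idea of making the two latent geometries agree on overlapping points.
    d_src = pairwise_dist(z_src)
    d_tgt = pairwise_dist(z_tgt)
    return torch.mean((d_src - d_tgt) ** 2)

# Hypothetical usage: add the alignment penalty to each task's regression loss.
# total_loss = mse_src + mse_tgt + lambda_align * align_loss(z_src, z_tgt)
```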