Transformer Based Molecule Encoding for Property Prediction (2011.03518v2)
Published 5 Nov 2020 in q-bio.QM
Abstract: Neural methods for molecular property prediction must efficiently encode structure-property relationships to be accurate. Recent work using graph algorithms shows limited generalization in the latent molecule encoding space. We build a Transformer-based molecule encoder and property predictor network with novel input featurization that performs significantly better than existing methods. We adapt our model to semi-supervised learning so that it also performs well on the limited experimental data usually available in practice.