Supervised Pretraining for Molecular Force Fields and Properties Prediction (2211.14429v1)

Published 23 Nov 2022 in physics.chem-ph, cs.LG, and q-bio.BM

Abstract: Machine learning approaches have become popular for molecular modeling tasks, including molecular force fields and property prediction. Traditional supervised learning methods suffer from a scarcity of labeled data for particular tasks, motivating the use of large-scale datasets from other relevant tasks. We propose to pretrain neural networks on a dataset of 86 million molecules, with atom charges and 3D geometries as inputs and molecular energies as labels. Experiments show that, compared to training from scratch, fine-tuning the pretrained model significantly improves performance on seven molecular property prediction tasks and two force field tasks. We also demonstrate that the learned representations from the pretrained model contain adequate information about molecular structures: linear probing of the representations can predict many molecular attributes, including atom types, interatomic distances, classes of molecular scaffolds, and the existence of molecular fragments. Our results show that supervised pretraining is a promising research direction in molecular modeling.
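The abstract describes two recipes worth making concrete: pretraining an encoder on molecular energies from atom charges and 3D geometries, then reusing it either by fine-tuning with a new head or by linear probing of frozen representations. The sketch below is a minimal PyTorch illustration, not the paper's actual architecture; `MoleculeEncoder`, the hidden size, the sum-pooling readout, and all tensor shapes are assumptions made for the example.

```python
# Minimal sketch of supervised pretraining + fine-tuning / linear probing
# for molecules. All module names, sizes, and data here are illustrative.
import torch
import torch.nn as nn

class MoleculeEncoder(nn.Module):
    """Toy permutation-invariant encoder over (charge, xyz) atom features."""
    def __init__(self, hidden: int = 128):
        super().__init__()
        # 1 charge + 3 coordinates = 4 input features per atom
        self.atom_mlp = nn.Sequential(
            nn.Linear(4, hidden), nn.SiLU(), nn.Linear(hidden, hidden)
        )

    def forward(self, charges, coords):
        # charges: (batch, atoms), coords: (batch, atoms, 3)
        x = torch.cat([charges.unsqueeze(-1), coords], dim=-1)
        h = self.atom_mlp(x)      # per-atom embeddings
        return h.sum(dim=1)       # sum-pool -> molecular representation

encoder = MoleculeEncoder()
energy_head = nn.Linear(128, 1)   # pretraining target: molecular energy

# --- pretraining step: regress energies on a (dummy) batch ---
charges = torch.randint(1, 10, (8, 20)).float()
coords = torch.randn(8, 20, 3)
energies = torch.randn(8, 1)
loss = nn.functional.mse_loss(energy_head(encoder(charges, coords)), energies)
loss.backward()

# --- fine-tuning: keep the pretrained encoder, swap in a fresh head ---
property_head = nn.Linear(128, 1)   # e.g. a downstream molecular property
pred_prop = property_head(encoder(charges, coords))

# --- linear probing: freeze the encoder, fit only a linear readout ---
with torch.no_grad():
    reps = encoder(charges, coords)
probe = nn.Linear(128, 5)           # e.g. a 5-class scaffold label
logits = probe(reps)
```

Sum-pooling over atoms keeps the toy encoder permutation-invariant, a property any realistic molecular encoder also needs; the paper's actual network and training details are in the full text.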

Authors (6)
  1. Xiang Gao (210 papers)
  2. Weihao Gao (30 papers)
  3. Wenzhi Xiao (3 papers)
  4. Zhirui Wang (18 papers)
  5. Chong Wang (308 papers)
  6. Liang Xiang (30 papers)
Citations (7)
