3D Denoisers are Good 2D Teachers: Molecular Pretraining via Denoising and Cross-Modal Distillation (2309.04062v1)

Published 8 Sep 2023 in cs.LG, cs.AI, and physics.chem-ph

Abstract: Pretraining molecular representations from large unlabeled data is essential for molecular property prediction due to the high cost of obtaining ground-truth labels. While there exist various 2D graph-based molecular pretraining approaches, these methods struggle to show statistically significant gains in predictive performance. Recent work has thus instead proposed 3D conformer-based pretraining under the task of denoising, which has led to promising results. During downstream finetuning, however, models trained with 3D conformers require accurate atom coordinates of previously unseen molecules, which are computationally expensive to acquire at scale. In light of this limitation, we propose D&D, a self-supervised molecular representation learning framework that pretrains a 2D graph encoder by distilling representations from a 3D denoiser. With denoising followed by cross-modal knowledge distillation, our approach enjoys the use of knowledge obtained from denoising as well as painless application to downstream tasks with no access to accurate conformers. Experiments on real-world molecular property prediction datasets show that the graph encoder trained via D&D can infer 3D information from the 2D graph alone and achieves superior performance and label efficiency compared to other baselines.
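The abstract describes a two-stage scheme: first pretrain a 3D encoder via coordinate denoising, then distill its representations into a 2D graph encoder so that downstream prediction needs no conformers. The sketch below illustrates that idea only; the MLP encoders, tensor shapes, loss choices, and hyperparameters are placeholder assumptions, not the authors' implementation.

```python
# Illustrative sketch of denoising pretraining followed by cross-modal
# distillation. Plain MLPs stand in for the real 2D graph encoder and
# 3D conformer encoder; all names and dimensions are assumptions.
import torch
import torch.nn as nn

dim_2d, dim_3d, hidden = 64, 3, 128

teacher_3d = nn.Sequential(nn.Linear(dim_3d, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
student_2d = nn.Sequential(nn.Linear(dim_2d, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
denoise_head = nn.Linear(hidden, dim_3d)  # predicts the injected coordinate noise

coords = torch.randn(32, dim_3d)       # placeholder atom coordinates
graph_feats = torch.randn(32, dim_2d)  # placeholder 2D graph features

# Stage 1: pretrain the 3D teacher by predicting the noise added to coordinates.
opt_t = torch.optim.Adam(list(teacher_3d.parameters()) + list(denoise_head.parameters()), lr=1e-3)
noise = 0.1 * torch.randn_like(coords)
pred_noise = denoise_head(teacher_3d(coords + noise))
loss_denoise = nn.functional.mse_loss(pred_noise, noise)
loss_denoise.backward()
opt_t.step()

# Stage 2: distill the frozen teacher's representations into the 2D student,
# so finetuning and inference later require only the 2D graph, no conformers.
opt_s = torch.optim.Adam(student_2d.parameters(), lr=1e-3)
with torch.no_grad():
    target = teacher_3d(coords)
loss_distill = nn.functional.mse_loss(student_2d(graph_feats), target)
loss_distill.backward()
opt_s.step()
```

In this reading, the distillation loss aligns the student's 2D-graph representations with the teacher's conformer-derived ones, which is how the 2D encoder can "infer 3D information" at downstream time without coordinates.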

Authors (8)
  1. Sungjun Cho (18 papers)
  2. Dae-Woong Jeong (7 papers)
  3. Sung Moon Ko (9 papers)
  4. Jinwoo Kim (40 papers)
  5. Sehui Han (11 papers)
  6. Seunghoon Hong (41 papers)
  7. Honglak Lee (174 papers)
  8. Moontae Lee (54 papers)
Citations (1)
