Explain causes of task-specific preferences across molecular SSL methods

Determine the causal mechanisms behind the task-specific performance differences among three molecular pre-training methods: 3D denoising, 2D graph masking, and 2D–3D contrastive learning. Specifically, explain why 3D denoising favors quantum chemical property prediction, whereas 2D graph masking and 2D–3D contrastive learning favor biological and physicochemical property prediction.

Background

The UniCorn paper (Feng et al., 2024) observes that different self-supervised pre-training strategies for molecular data tend to perform better on different categories of downstream tasks: 3D denoising excels on quantum chemical properties, while 2D graph masking and 2D–3D contrastive learning perform better on biological and physicochemical properties.

Although the work offers a unified view of these methods via contrastive learning and clustering, the authors explicitly state that the causes of this phenomenon remain unclear. This gap in understanding affects the design and selection of pre-training objectives.

References

Generally, 3D denoising methods favor quantum chemical property prediction, while 2D graph masking and 2D-3D contrastive learning prefer biological and physicochemical property prediction. This phenomenon, also manifested in section~\ref{sec:main exp}, is hardly discussed by previous studies and the causes are still unclear.

Feng et al., "UniCorn: A Unified Contrastive Learning Approach for Multi-view Molecular Representation Learning," arXiv:2405.10343, 15 May 2024, Introduction (Section 1).