Do complex attention layers in molecular foundation models reflect true biology or overfitting?

Determine whether the complex attention layers learned by attention-based molecular foundation models for gene expression, such as Geneformer, reflect genuine biological complexity or overfitting to the training data. Resolving this requires developing and applying dedicated benchmarks that test whether these models capture generalisable causal representations of biological systems.

Background

Within the discussion of attention-based molecular foundation models (e.g., Geneformer), the authors note that while some attention layers display interpretable patterns (e.g., attending to highly connected or highly expressed genes), other layers are difficult to interpret. They raise the concern that such complexity may reflect overfitting rather than true biological structure.
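One way to make this concern concrete is an attention probe: extract a layer's attention weights and correlate the attention each gene token receives with an external per-gene property, such as network connectivity or expression level. The sketch below illustrates the idea under stated assumptions; the model name, the random inputs, and the `gene_degree` covariate are illustrative placeholders, not the authors' pipeline or Geneformer's actual interface.

```python
# A minimal sketch of an attention probe, assuming a BERT-style encoder
# with rank-value encoded gene tokens (as in Geneformer-like models).
# The model, inputs, and per-gene covariate here are placeholders.
import torch
from scipy.stats import spearmanr
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

# Toy input: each token id stands in for one gene in a ranked expression list.
input_ids = torch.randint(low=1000, high=2000, size=(1, 128))

with torch.no_grad():
    out = model(input_ids=input_ids)

# out.attentions: tuple of (batch, heads, seq, seq) tensors, one per layer.
layer = 3  # layer chosen for probing (arbitrary here)
attn = out.attentions[layer][0]                   # (heads, seq, seq)
attention_received = attn.mean(dim=0).sum(dim=0)  # total attention each token receives

# Hypothetical per-gene covariate, e.g. degree in a gene-regulatory network.
gene_degree = torch.rand(input_ids.shape[1])

rho, p = spearmanr(attention_received.numpy(), gene_degree.numpy())
print(f"layer {layer}: Spearman rho={rho:.2f} (p={p:.2g})")
```

Layers whose attention correlates strongly with such covariates admit an interpretation; layers with no such relationship are the ones whose complexity could equally reflect biology or overfitting.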

They point to poor generalisation in independent benchmarks and emphasise the need for dedicated evaluations to ascertain whether these models have learned causal, generalisable representations of biology, explicitly stating uncertainty about the nature of these complex layers.
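As a rough illustration of what such a dedicated evaluation could look like, the sketch below scores a model's predicted post-perturbation expression changes against observed changes on perturbations held out of training. `score_generalisation` and the `model_predict` interface are hypothetical names introduced here; a real benchmark would plug in, for example, in-silico perturbation outputs from the foundation model and compare them against simple baselines on unseen cell types.

```python
# A schematic benchmark for causal generalisation, assuming a model that
# predicts per-gene log-fold changes (LFC) for a given perturbation.
import numpy as np
from scipy.stats import pearsonr

def score_generalisation(model_predict, heldout_perturbations):
    """Mean correlation between predicted and observed log-fold changes
    across perturbations that the model never saw during training."""
    scores = []
    for pert in heldout_perturbations:
        predicted = model_predict(pert["target_gene"])  # predicted LFC per gene
        observed = pert["observed_lfc"]                 # measured LFC per gene
        r, _ = pearsonr(predicted, observed)
        scores.append(r)
    return float(np.mean(scores))

# Toy usage with a random baseline "model"; a meaningful benchmark would
# require the foundation model to beat such baselines on held-out contexts.
rng = np.random.default_rng(0)
heldout = [{"target_gene": g, "observed_lfc": rng.normal(size=50)}
           for g in ("TP53", "MYC")]
baseline = lambda gene: rng.normal(size=50)
print(f"baseline score: {score_generalisation(baseline, heldout):.2f}")
```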

References

Whether these complex layers reflect the true complexity of the underlying biology or are rather evidence for overfitting to the training data is not clear. To determine whether molecular foundation models indeed capture generalisable causal representations of biology, dedicated benchmarks are needed.

Molecular causality in the advent of foundation models (arXiv:2401.09558, Lobentanzer et al., 17 Jan 2024), Section "Causality in foundation models", Benchmarking paragraph.