Performance of baseline and synthesizability-constrained models when adding QED and SA to the objective

Determine the performance of GraphGA, SyntheMol, Fragment-based GFlowNet (FGFN), and Reaction-GFlowNet (RGFN) on the ATP-dependent Clp protease proteolytic subunit (ClpP) docking task when the optimization objective is modified to jointly include the quantitative estimate of drug-likeness (QED) and the synthetic accessibility (SA) score in addition to docking score, rather than optimizing docking score alone.

Background

The paper compares Saturn against several models reported in the RGFN work on a docking case study against ClpP. In the original RGFN setup, the optimization objective focuses solely on docking score, whereas evaluation also reports QED and SA scores. The authors note that docking alone can be exploitable and argue that downstream metrics should be included directly in the multi-parameter optimization (MPO) objective.

They explicitly state uncertainty about how the compared models would perform if QED and SA were part of the objective being optimized, motivating a clearer comparison under multi-parameter objectives that better reflect drug-likeness and synthesizability constraints.
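
To make the difference between the two objectives concrete, the following is a minimal sketch of a docking-only reward next to a joint docking, QED, and SA reward. It is not the formulation used in the paper or in the RGFN work: the product aggregation, the linear normalization, the docking bounds of -4 and -12, and all function names are assumptions chosen for illustration. The only facts relied on are that lower docking scores are better, that QED lies in [0, 1], and that the SA score ranges from 1 (easy to synthesize) to 10 (hard).

```python
def clamp01(x: float) -> float:
    """Clip a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def docking_reward(score: float, best: float = -12.0, worst: float = -4.0) -> float:
    """Map a docking score to [0, 1]; more negative scores are better.

    The bounds `best` and `worst` are illustrative assumptions.
    """
    return clamp01((worst - score) / (worst - best))


def sa_reward(sa: float, best: float = 1.0, worst: float = 10.0) -> float:
    """Map an SA score (1 = easy, 10 = hard) to [0, 1], higher is better."""
    return clamp01((worst - sa) / (worst - best))


def docking_only_reward(docking: float, qed: float, sa: float) -> float:
    """Objective that ignores QED and SA; they would only be reported afterwards."""
    return docking_reward(docking)


def mpo_reward(docking: float, qed: float, sa: float) -> float:
    """Hypothetical joint objective: product of the three normalized terms.

    With a product, a poor value in any one component lowers the whole reward.
    """
    return docking_reward(docking) * clamp01(qed) * sa_reward(sa)


if __name__ == "__main__":
    # A molecule that docks well but has low QED and a high SA score.
    exploit = dict(docking=-12.0, qed=0.2, sa=8.0)
    print("docking only:", docking_only_reward(**exploit))
    print("joint MPO:   ", mpo_reward(**exploit))
```

Under the docking-only objective the example molecule receives the maximum reward, while the joint objective scores it far lower. This is the kind of shift that makes it unclear how models tuned for docking alone would rank once QED and SA are enforced.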

References

The RGFN work (which also reports results for GraphGA, SyntheMol, and FGFN), defines the objective function to only optimize for docking score, but assesses generated molecules also by their QED and SA scores. It is unclear the performance of these models if the objective function were modified to also enforce these properties.

Directly Optimizing for Synthesizability in Generative Molecular Design using Retrosynthesis Models (2407.12186 - Guo et al., 16 Jul 2024) in Methods, Experimental Caveats, Objective Function (Section 3; Item 4)