Training-set overlap of AbLang2 and ESM3 with the OAS sequence inpainting test set

Determine whether the 2,000-sequence holdout test set of paired human antibodies from the Observed Antibody Space (OAS), used in the sequence inpainting experiments, overlaps with the training data of AbLang2 or ESM3. Resolving this would clarify whether training–test leakage affects the reported comparisons.

Background

In the sequence inpainting evaluation, the authors compare IgCraft against AbLang2 and ESM3 using a holdout set of 2,000 paired antibody sequences from OAS. They describe the sampling setup for each model and note that ESM3-open requires single-chain inputs, so heavy and light chains are concatenated with a (G4S)3 linker.
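The single-chain formatting described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `concat_chains` and `split_chains` are hypothetical, and the heavy-then-light ordering is an assumption, since the text only states that the two chains are joined by a (G4S)3 linker.

```python
# (G4S)3 linker: three repeats of Gly-Gly-Gly-Gly-Ser (15 residues).
LINKER = "GGGGS" * 3


def concat_chains(heavy: str, light: str, linker: str = LINKER) -> str:
    """Join a heavy and a light chain into one single-chain input.

    Assumption: the heavy chain comes first; the source does not state the order.
    """
    return heavy + linker + light


def split_chains(single_chain: str, heavy_len: int, linker: str = LINKER) -> tuple[str, str]:
    """Recover the two chains from a single-chain sequence of known heavy-chain length."""
    light_start = heavy_len + len(linker)
    # Guard against a sequence that was not built with the expected linker.
    if single_chain[heavy_len:light_start] != linker:
        raise ValueError("linker not found at the expected position")
    return single_chain[:heavy_len], single_chain[light_start:]
```

Splitting by known heavy-chain length, rather than searching for the linker string, avoids ambiguity if a glycine/serine-rich stretch happens to occur inside a chain.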

The authors explicitly state that they could not determine whether the test set overlaps with the training sets of AbLang2 and ESM3. Establishing whether such overlap exists matters for fair benchmarking, because training–test leakage could inflate performance metrics such as amino acid recovery.
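One simple way to begin such a check, assuming the training sequences of a model are available as (heavy, light) pairs, is an exact-match comparison. The sketch below is illustrative only: the names `normalize`, `pair_key`, and `exact_overlap` are hypothetical, and exact matching gives only a lower bound on leakage, since near-identical sequences would require an identity-based search that is not shown here.

```python
import hashlib


def normalize(seq: str) -> str:
    """Strip whitespace and upper-case so formatting differences do not hide a match."""
    return "".join(seq.split()).upper()


def pair_key(heavy: str, light: str) -> str:
    """Hash a normalized heavy/light pair into a fixed-size key."""
    joined = normalize(heavy) + "|" + normalize(light)
    return hashlib.sha256(joined.encode("ascii")).hexdigest()


def exact_overlap(test_pairs, train_pairs):
    """Return indices of test pairs that also appear verbatim in the training pairs."""
    train_keys = {pair_key(h, l) for h, l in train_pairs}
    return [i for i, (h, l) in enumerate(test_pairs) if pair_key(h, l) in train_keys]


# Toy data for illustration only; these are not real OAS sequences.
test_pairs = [("EVQLV", "DIQMT"), ("QVQLQ", "EIVLT")]
train_pairs = [("evqlv", "DIQMT"), ("AAAA", "CCCC")]
overlap = exact_overlap(test_pairs, train_pairs)
fraction = len(overlap) / len(test_pairs)
```

Hashing keeps memory use predictable when the training set is large. For a model trained on unpaired chains, the same idea could be applied per chain rather than per pair; that variant is an assumption and is not described in the source.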

References

We were not able to determine if this test set overlaps with the training sets of AbLang2 and ESM3.

IgCraft: A versatile sequence generation framework for antibody discovery and engineering (2503.19821 - Greenig et al., 25 Mar 2025) in Results, Section “Sequence inpainting,” Table caption for mean amino acid recovery (AAR) (label: table:inpainting)