Training-set overlap of AbLang2 and ESM3 with the OAS sequence inpainting test set
Determine whether the 2,000-sequence holdout test set of paired human antibodies from the Observed Antibody Space (OAS), which is used for the sequence inpainting experiments, overlaps with the training datasets of AbLang2 and ESM3. Resolving this would clarify whether training–test leakage affects the reported comparisons.
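As an illustration only, the simplest form of such a check is an exact-match comparison between test and training sequences. The sketch below assumes both sets are available as (heavy chain, light chain) amino acid string pairs; the function names and toy sequences are hypothetical and are not taken from the paper or from the AbLang2 or ESM3 codebases. Exact matching gives only a lower bound on overlap, since near-identical sequences would go undetected without an identity-based comparison.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: (heavy chain, light chain) amino acid strings.
Pair = Tuple[str, str]


def normalize(pair: Pair) -> Pair:
    """Strip whitespace and upper-case both chains so formatting does not hide a match."""
    heavy, light = pair
    return heavy.strip().upper(), light.strip().upper()


def exact_overlap(test_pairs: Iterable[Pair], train_pairs: Iterable[Pair]) -> Set[Pair]:
    """Return the normalized test pairs that also occur verbatim in the training pairs."""
    train_set = {normalize(p) for p in train_pairs}
    return {normalize(p) for p in test_pairs if normalize(p) in train_set}


def overlap_fraction(test_pairs: Iterable[Pair], train_pairs: Iterable[Pair]) -> float:
    """Fraction of unique test pairs that have an exact match in the training pairs."""
    test_set = {normalize(p) for p in test_pairs}
    if not test_set:
        return 0.0
    return len(exact_overlap(test_set, train_pairs)) / len(test_set)


# Toy sequences for illustration only (not real OAS entries).
toy_test = [("EVQLVESGG", "DIQMTQSPS"), ("QVQLQESGP", "EIVLTQSPG")]
toy_train = [("evqlvesgg", "DIQMTQSPS "), ("QVQLVQSGA", "DIQMTQSPS")]

shared = exact_overlap(toy_test, toy_train)
fraction = overlap_fraction(toy_test, toy_train)
```

In this toy case one of the two test pairs matches a training pair after normalization. A real analysis would additionally need access to the models' actual training data, which is the missing piece the problem statement points to.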
References
We were not able to determine if this test set overlaps with the training sets of AbLang2 and ESM3.
— IgCraft: A versatile sequence generation framework for antibody discovery and engineering
(2503.19821 - Greenig et al., 25 Mar 2025) in Results, Section “Sequence inpainting,” Table caption for mean amino acid recovery (AAR) (label: table:inpainting)