Essential reason for long-sequence training instability
Determine why long biological sequences correlate with training instability in foundation models for biological sequences, so that these models can be trained stably without sacrificing information through truncation or other ad hoc procedures.
References
The essential reason for the correlation between long sequences and training instability has not been completely deciphered.
— Progress and Opportunities of Foundation Models in Bioinformatics
(2402.04286 - Li et al., 6 Feb 2024) in Challenges, subsection “Long sequence length”