Tailoring xLSTM for biological and chemical sequences and benchmarking against domain-specific LLMs
Determine how to adapt the xLSTM architecture and its training procedures to biological and chemical sequence modeling, covering genomic DNA, protein sequences, and SMILES representations of small molecules. Then establish how xLSTM-based models perform relative to other domain-specific large language model architectures in each of these modalities.
References
However, it remains unclear how to best tailor xLSTM for biological and chemical sequences and how xLSTM compares to other domain-specific LLM architectures.
Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences (arXiv:2411.04165, Schmidinger et al., 6 Nov 2024), Section 1, Introduction