scInterpreter: Training Large Language Models to Interpret scRNA-seq Data for Cell Type Annotation (2402.12405v1)
Published 18 Feb 2024 in q-bio.GN and cs.AI
Abstract: Despite the inherent limitations of existing LLMs in directly reading and interpreting single-cell omics data, they show significant potential and flexibility as foundation models. This research focuses on how to train and adapt an LLM to interpret and distinguish cell types in single-cell RNA sequencing data. Our preliminary results indicate that these foundation models excel at accurately categorizing known cell types, demonstrating the potential of LLMs as effective tools for uncovering new biological insights.