Language models in molecular discovery (2309.16235v1)
Abstract: The success of language models, especially transformer-based architectures, has trickled into other domains, giving rise to "scientific language models" that operate on small molecules, proteins or polymers. In chemistry, language models contribute to accelerating the molecule discovery cycle, as evidenced by promising recent findings in early-stage drug discovery. Here, we review the role of language models in molecular discovery, underlining their strength in de novo drug design, property prediction and reaction chemistry. We highlight valuable open-source software assets, thus lowering the entry barrier to the field of scientific language modeling. Last, we sketch a vision for future molecular design that combines a chatbot interface with access to computational chemistry tools. Our contribution serves as a valuable resource for researchers, chemists, and AI enthusiasts interested in understanding how language models can and will be used to accelerate chemical discovery.