
Recent advances in the Self-Referencing Embedding Strings (SELFIES) library (2302.03620v1)

Published 7 Feb 2023 in physics.chem-ph and cs.LG

Abstract: String-based molecular representations play a crucial role in cheminformatics applications, and with the growing success of deep learning in chemistry, have been readily adopted into machine learning pipelines. However, traditional string-based representations such as SMILES are often prone to syntactic and semantic errors when produced by generative models. To address these problems, a novel representation, SELF-referencIng Embedded Strings (SELFIES), was proposed that is inherently 100% robust, alongside an accompanying open-source implementation. Since then, we have generalized SELFIES to support a wider range of molecules and semantic constraints and streamlined its underlying grammar. We have implemented this updated representation in subsequent versions of the selfies library, where we have also made major advances with respect to design, efficiency, and supported features. Hence, we present the current status of the selfies library (version 2.1.1) in this manuscript.
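
To make the abstract's point about syntactic errors concrete, here is a minimal sketch of two ways a generated SMILES string can be malformed: an unbalanced branch parenthesis and a ring-closure label that is opened but never closed. The function name `smiles_syntax_errors` is hypothetical and is not part of the selfies library. The check is deliberately simplified: it handles only single-digit ring labels, ignores `%nn` labels, and does not look at valence (semantic) errors at all.

```python
def smiles_syntax_errors(smiles: str) -> list[str]:
    """Return simple syntax problems found in a SMILES string.

    Illustrative only: this is not a full SMILES parser. It checks
    branch parentheses and single-digit ring-closure labels, and it
    does not handle two-digit '%nn' ring labels.
    """
    errors = []
    depth = 0            # current nesting depth of branch parentheses
    open_rings = set()   # ring-closure digits seen an odd number of times
    in_bracket = False   # True while inside a bracket atom such as [13CH4]

    for i, ch in enumerate(smiles):
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif in_bracket:
            # Digits inside [...] are isotopes, H counts, or charges,
            # not ring-closure labels, so skip them.
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                errors.append(f"unmatched ')' at position {i}")
                depth = 0
        elif ch.isdigit():
            # A ring label opens on first sight and closes on the second.
            if ch in open_rings:
                open_rings.remove(ch)
            else:
                open_rings.add(ch)

    if depth > 0:
        errors.append(f"{depth} unclosed '('")
    for label in sorted(open_rings):
        errors.append(f"ring label {label} never closed")
    return errors


if __name__ == "__main__":
    for s in ["c1ccccc1", "c1ccccc", "CC(=O", "CC(=O)O"]:
        print(s, smiles_syntax_errors(s))
```

A generative model that emits SMILES character by character can produce strings like `c1ccccc` or `CC(=O`, which fail checks of this kind. The abstract describes SELFIES as a representation designed to be robust against such failures.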

Authors (6)
  1. Alston Lo (6 papers)
  2. Robert Pollice (8 papers)
  3. AkshatKumar Nigam (10 papers)
  4. Andrew D. White (29 papers)
  5. Mario Krenn (74 papers)
  6. Alán Aspuru-Guzik (227 papers)
Citations (6)
