Construction of minimal DFAs from biological motifs (1004.1298v2)
Published 8 Apr 2010 in cs.FL and q-bio.QM
Abstract: Deterministic finite automata (DFAs) are constructed for various purposes in computational biology. Little attention, however, has been given to the efficient construction of minimal DFAs. In this article, we define simple non-deterministic finite automata (NFAs) and prove that the standard subset construction transforms NFAs of this type into minimal DFAs. Furthermore, we show how simple NFAs can be constructed from two types of patterns popular in bioinformatics, namely (sets of) generalized strings and (generalized) strings with a Hamming neighborhood.
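To make the pipeline named in the abstract concrete, here is a minimal Python sketch of the standard subset construction applied to an NFA built from a single generalized string (a sequence of character sets). It is an illustration under stated assumptions, not the article's construction: the function names (`generalized_string_nfa`, `subset_construction`, `accepts`), the NFA layout (a linear chain with a self-loop on the start state, so that the automaton accepts every text ending with a match), and the DNA alphabet are all hypothetical choices made here. The sketch does not check the article's definition of a simple NFA and does not itself prove that the resulting DFA is minimal.

```python
from collections import deque


def generalized_string_nfa(pattern):
    """Build an NFA accepting every text that ends with a match of `pattern`.

    `pattern` is a list of character sets (a generalized string).
    States are 0..len(pattern); state 0 loops on every character and
    state len(pattern) is the only accepting state.
    This layout is an assumption made for illustration.
    """
    def step(state, ch):
        successors = set()
        if state == 0:
            successors.add(0)  # self-loop: a match may start at any position
        if state < len(pattern) and ch in pattern[state]:
            successors.add(state + 1)  # advance along the pattern
        return successors

    return step, {0}, {len(pattern)}


def subset_construction(step, start, accepting, alphabet):
    """Standard subset construction, exploring only reachable state sets."""
    start_set = frozenset(start)
    ids = {start_set: 0}          # maps each NFA state set to a DFA state id
    delta = {}                    # DFA transitions: (state id, char) -> state id
    dfa_accepting = set()
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        if current & accepting:
            dfa_accepting.add(ids[current])
        for ch in alphabet:
            target = frozenset(s for q in current for s in step(q, ch))
            if target not in ids:
                ids[target] = len(ids)
                queue.append(target)
            delta[(ids[current], ch)] = ids[target]
    return delta, 0, dfa_accepting


def accepts(delta, start, accepting, text):
    """Run the DFA over `text` and report whether it ends in an accepting state."""
    state = start
    for ch in text:
        state = delta[(state, ch)]
    return state in accepting


# Example: the generalized string A[CG]T over the DNA alphabet.
pattern = [{"A"}, {"C", "G"}, {"T"}]
step, start, accepting = generalized_string_nfa(pattern)
delta, dfa_start, dfa_accepting = subset_construction(step, start, accepting, "ACGT")
```

With this example, `accepts(delta, dfa_start, dfa_accepting, "GGACT")` is true because the text ends with `ACT`, while `"ACTA"` is rejected because the match does not end at the last character. Only state sets reachable from the start set are generated, which is why the subset construction stays small for chain-shaped NFAs like this one.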