Learning Several Languages from Labeled Strings: State Merging and Evolutionary Approaches (1806.01630v2)
Published 5 Jun 2018 in cs.FL
Abstract: The problem of learning pairwise disjoint deterministic finite automata (DFAs) from positive examples has recently been addressed. In this paper, we address the problem of identifying a set of DFAs from labeled strings and propose two methods. The first is based on state merging and a heuristic related to the size of each state merging iteration: state merging operations involving a large number of states are extracted to provide sub-DFAs. The second is based on a multi-objective evolutionary algorithm whose fitness function takes into account the accuracy of the DFA with respect to the learning sample, as well as the desired number of DFAs. We evaluate our methods on a dataset originating from industry.
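To make the two quantities named for the evolutionary method concrete, the following is a minimal Python sketch of how a candidate set of DFAs might be scored against a labeled sample. It is an illustration only and not the paper's fitness function: the `DFA` class, the `objectives` function, the representation of a candidate as a mapping from label to DFA, and the rule that a string counts as correct only when it is accepted by exactly the DFA of its own label are all assumptions made here.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    """Hypothetical DFA representation with a partial transition function."""
    start: int
    accepting: frozenset
    delta: dict  # maps (state, symbol) -> next state

    def accepts(self, word):
        state = self.start
        for symbol in word:
            state = self.delta.get((state, symbol))
            if state is None:  # missing transition: reject
                return False
        return state in self.accepting


def objectives(dfas, sample, desired_count):
    """Return (accuracy, count_gap) for a candidate {label: DFA}.

    Assumed scoring rule: a labeled string is correct only if it is
    accepted by the DFA carrying its label and by no other DFA.
    accuracy is to be maximized, count_gap is to be minimized.
    """
    correct = 0
    for word, label in sample:
        accepted_by = {name for name, dfa in dfas.items() if dfa.accepts(word)}
        if accepted_by == {label}:
            correct += 1
    accuracy = correct / len(sample) if sample else 0.0
    count_gap = abs(len(dfas) - desired_count)
    return accuracy, count_gap
```

A multi-objective evolutionary algorithm would keep the two values separate and compare candidates by Pareto dominance rather than folding them into one number; how the paper actually combines them is not stated in the abstract.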