Minimum Probabilistic Finite State Learning Problem on Finite Data Sets: Complexity, Solution and Approximations (1501.01300v2)
Abstract: In this paper, we study the problem of determining a minimum-state probabilistic finite state machine capable of generating symbol sequences that are statistically identical to the samples provided. This problem is qualitatively similar to the classical Hidden Markov Model problem and has been studied from a practical point of view in several works, beginning with: Shalizi, C.R., Shalizi, K.L., Crutchfield, J.P. (2002) \textit{An algorithm for pattern discovery in time series.} Technical Report 02-10-060, Santa Fe Institute. arxiv.org/abs/cs.LG/0210025. We show that the underlying problem is $\mathrm{NP}$-hard and thus that all existing polynomial-time algorithms must be approximations on finite data sets. Using our $\mathrm{NP}$-hardness proof, we show how to construct a provably correct algorithm for finding a minimum-state probabilistic finite state machine from given data, and we empirically study its running time.