
Learning Hidden Markov Models Using Conditional Samples (2302.14753v2)

Published 28 Feb 2023 in cs.LG, cs.AI, and stat.ML

Abstract: This paper is concerned with the computational complexity of learning the Hidden Markov Model (HMM). Although HMMs are some of the most widely used tools in sequential and time series modeling, they are cryptographically hard to learn in the standard setting where one has access to i.i.d. samples of observation sequences. In this paper, we depart from this setup and consider an interactive access model, in which the algorithm can query for samples from the conditional distributions of the HMM. We show that interactive access to the HMM enables computationally efficient learning algorithms, thereby bypassing cryptographic hardness. Specifically, we obtain efficient algorithms for learning HMMs in two settings: (a) An easier setting where we have query access to the exact conditional probabilities. Here our algorithm runs in polynomial time and makes polynomially many queries to approximate any HMM in total variation distance. (b) A harder setting where we can only obtain samples from the conditional distributions. Here the performance of the algorithm depends on a new parameter, called the fidelity of the HMM. We show that this captures cryptographically hard instances and previously known positive results. We also show that these results extend to a broader class of distributions with latent low-rank structure. Our algorithms can be viewed as generalizations and robustifications of Angluin's $L^*$ algorithm for learning deterministic finite automata from membership queries.
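
The interactive access model in the abstract is easy to make concrete. Below is a minimal sketch (not the authors' algorithm) of the two oracles the paper assumes: exact next-symbol conditional probabilities for setting (a), and samples from the conditional distribution given an observed prefix for setting (b). The class and parameter names (`ConditionalHMMOracle`, transition matrix `T`, emission matrix `O`, initial distribution `pi`) are illustrative assumptions; the conditionals themselves are computed with the standard forward recursion for HMMs.

```python
import numpy as np

class ConditionalHMMOracle:
    """Answers the two kinds of interactive queries described in the abstract."""

    def __init__(self, T, O, pi, rng=None):
        self.T = np.asarray(T, dtype=float)    # T[i, j] = P(next state j | state i)
        self.O = np.asarray(O, dtype=float)    # O[i, x] = P(observation x | state i)
        self.pi = np.asarray(pi, dtype=float)  # pi[i]   = P(initial state i)
        self.rng = rng if rng is not None else np.random.default_rng()

    def _predictive_belief(self, prefix):
        # Forward recursion: unnormalized distribution over the hidden state
        # that emits the *next* symbol, jointly with the observed prefix.
        b = self.pi.copy()
        for x in prefix:
            b = (b * self.O[:, x]) @ self.T
        return b

    def conditional_prob(self, prefix):
        # Setting (a): exact conditionals P(o_{t+1} = x | o_1, ..., o_t).
        b = self._predictive_belief(prefix)
        p = b @ self.O
        return p / p.sum()

    def conditional_sample(self, prefix, length):
        # Setting (b): draw a continuation of `length` symbols from the
        # conditional distribution of the HMM given the prefix.
        b = self._predictive_belief(prefix)
        b /= b.sum()
        out = []
        for _ in range(length):
            p = b @ self.O
            p /= p.sum()
            x = int(self.rng.choice(len(p), p=p))
            out.append(x)
            b = (b * self.O[:, x]) @ self.T
            b /= b.sum()
        return out
```

In this sketch, a learner in setting (a) interacts only through `conditional_prob`, playing the role that membership queries play in Angluin's $L^*$; in setting (b) it sees only the output of `conditional_sample`, and per the abstract, the achievable performance there is governed by the paper's fidelity parameter.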

References (23)
  1. Michael Alekhnovich “More on average case vs approximation complexity” In Symposium on Foundations of Computer Science, 2003
  2. Animashree Anandkumar, Rong Ge, Daniel Hsu, Sham M Kakade and Matus Telgarsky “Tensor decompositions for learning latent variable models” In Journal of Machine Learning Research, 2014
  3. Dana Angluin “Learning regular sets from queries and counterexamples” In Information and computation, 1987
  4. Rishiraj Bhattacharyya and Sourav Chakraborty “Property testing of joint distributions using conditional samples” In ACM Transactions on Computation Theory, 2018
  5. Avrim Blum, Merrick Furst, Michael Kearns and Richard J Lipton “Cryptographic primitives based on hard learning problems” In Advances in Cryptology, 1994
  6. Avrim Blum, Merrick Furst, Jeffrey Jackson, Michael Kearns, Yishay Mansour and Steven Rudich “Weakly learning DNF and characterizing statistical query learning using Fourier analysis” In Symposium on Theory of Computing, 1994
  7. Mary Cryan, Leslie Ann Goldberg and Paul W Goldberg “Evolutionary trees can be learned in polynomial time in the two-state general Markov model” In SIAM Journal on Computing, 2001
  8. Sourav Chakraborty, Eldar Fischer, Yonatan Goldhirsh and Arie Matsliah “On the power of conditional samples in distribution testing” In Conference on Innovations in Theoretical Computer Science, 2013
  9. Xi Chen, Rajesh Jayaram, Amit Levi and Erik Waingarten “Learning and testing junta distributions with subcube conditioning” In Conference on Learning Theory, 2021
  10. Clément L Canonne and Ronitt Rubinfeld “Testing probability distributions underlying aggregated data” In International Colloquium on Automata, Languages, and Programming, 2014
  11. Clément L Canonne, Dana Ron and Rocco A Servedio “Testing probability distributions using conditional samples” In SIAM Journal on Computing, 2015
  12. Varsha Dani, Thomas P Hayes and Sham M Kakade “Stochastic linear optimization under bandit feedback” In Conference on Learning Theory, 2008
  13. Daniel Hsu, Sham M Kakade and Tong Zhang “A spectral algorithm for learning hidden Markov models” In Journal of Computer and System Sciences, 2012
  14. Qingqing Huang, Rong Ge, Sham M Kakade and Munther A Dahleh “Minimal realization problems for hidden Markov models” In IEEE Transactions on Signal Processing, 2015
  15. Herbert Jaeger “Observable operator models for discrete stochastic time series” In Neural Computation, 2000
  16. Michael Kearns, Yishay Mansour, Dana Ron, Ronitt Rubinfeld, Robert Schapire and Linda Sellie “On the learnability of discrete distributions” In Symposium on Theory of Computing, 1994
  17. Aryeh Kontorovich, Boaz Nadler and Roi Weiss “On learning parametric-output HMMs” In International Conference on Machine Learning, 2013
  18. Elchanan Mossel and Sébastien Roch “Learning nonsingular phylogenies and hidden Markov models” In Symposium on Theory of Computing, 2005
  19. Tianyi Peng “Bound on difference of eigen projections of positive definite matrices”, Mathematics Stack Exchange, 2020 URL: https://math.stackexchange.com/q/3921839
  20. Stéphane Ross, Geoffrey Gordon and Drew Bagnell “A reduction of imitation learning and structured prediction to no-regret online learning” In International Conference on Artificial Intelligence and Statistics, 2011
  21. Vatsal Sharan, Sham M Kakade, Percy Liang and Gregory Valiant “Learning overcomplete HMMs” In Advances in Neural Information Processing Systems, 2017
  22. Roi Weiss and Boaz Nadler “Learning parametric-output HMMs with two aliased states” In International Conference on Machine Learning, 2015
  23. Zhisong Zhang, Emma Strubell and Eduard Hovy “A survey of active learning for natural language processing” In arXiv:2210.10109, 2022
Authors (4)
  1. Sham M. Kakade (88 papers)
  2. Akshay Krishnamurthy (92 papers)
  3. Gaurav Mahajan (13 papers)
  4. Cyril Zhang (34 papers)
Citations (5)
