On Computing the Total Variation Distance of Hidden Markov Models

Author Stefan Kiefer

Thumbnail PDF


  • Filesize: 416 kB
  • 13 pages

Document Identifiers

Author Details

Stefan Kiefer
  • University of Oxford, United Kingdom

Cite AsGet BibTex

Stefan Kiefer. On Computing the Total Variation Distance of Hidden Markov Models. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 130:1-130:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


We prove results on the decidability and complexity of computing the total variation distance (equivalently, the L_1-distance) of hidden Markov models (equivalently, labelled Markov chains). This distance measures the difference between the distributions on words that two hidden Markov models induce. The main results are: (1) it is undecidable whether the distance is greater than a given threshold; (2) approximation is #P-hard and in PSPACE.

Subject Classification

ACM Subject Classification
  • Theory of computation → Probabilistic computation
  • Theory of computation → Random walks and Markov chains
  • Labelled Markov Chains
  • Hidden Markov Models
  • Distance
  • Decidability
  • Complexity


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. P. Ailliot, C. Thompson, and P. Thomson. Space-time modelling of precipitation by using a hidden Markov model and censored Gaussian distributions. Journal of the Royal Statistical Society, 58(3):405-426, 2009. Google Scholar
  2. M. Alexandersson, S. Cawley, and L. Pachter. SLAM: Cross-species gene finding and alignment with a generalized pair hidden Markov model. Genome Research, 13:469-502, 2003. Google Scholar
  3. D. Chen, F. van Breugel, and J. Worrell. On the complexity of computing probabilistic bisimilarity. In Proceedings of FoSSaCS, volume 7213 of LNCS, pages 437-451. Springer, 2012. Google Scholar
  4. F.-S. Chen, C.-M. Fu, and C.-L. Huang. Hand gesture recognition using a real-time tracking method and hidden Markov models. Image and Vision Computing, 21(8):745-758, 2003. Google Scholar
  5. T. Chen and S. Kiefer. On the total variation distance of labelled Markov chains. In Proceedings of CSL-LICS, pages 33:1-33:10, 2014. Google Scholar
  6. G.A. Churchill. Stochastic models for heterogeneous DNA sequences. Bulletin of Mathematical Biology, 51(1):79-94, 1989. Google Scholar
  7. C. Cortes, M. Mohri, and A. Rastogi. L_p distance and equivalence of probabilistic automata. International Journal of Foundations of Computer Science, 18(04):761-779, 2007. Google Scholar
  8. M.S. Crouse, R.D. Nowak, and R.G. Baraniuk. Wavelet-based statistical signal processing using hidden Markov models. IEEE Transactions on Signal Processing, 46(4):886-902, April 1998. Google Scholar
  9. C. Dehnert, S. Junges, J.-P. Katoen, and M. Volk. A Storm is coming: A modern probabilistic model checker. In Proceedings of Computer Aided Verification (CAV), pages 592-600. Springer, 2017. Google Scholar
  10. J. Desharnais, V. Gupta, R. Jagadeesan, and P. Panangaden. Metrics for labelled Markov processes. Theoretical Computer Science, 318(3):323-354, 2004. Google Scholar
  11. R. Durbin. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998. Google Scholar
  12. S.R. Eddy. What is a hidden Markov model? Nature Biotechnology, 22(10):1315-1316, October 2004. Google Scholar
  13. N. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, second edition, 2002. Google Scholar
  14. S. Kannan, Z. Sweedyk, and S. Mahaney. Counting and random generation of strings in regular languages. In Proceedings of SODA, pages 551-557, 1995. Google Scholar
  15. S. Kiefer. On computing the total variation distance of hidden Markov models. Technical report, arxiv.org, 2018. Available at ěrb|https://arxiv.org/abs/1804.06170|. Google Scholar
  16. S. Kiefer, A.S. Murawski, J. Ouaknine, B. Wachter, and J. Worrell. Language equivalence for probabilistic automata. In Proceedings of Computer Aided Verification (CAV), volume 6806 of LNCS, pages 526-540. Springer, 2011. Google Scholar
  17. A. Krogh, B. Larsson, G. von Heijne, and E.L.L. Sonnhammer. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. Journal of Molecular Biology, 305(3):567-580, 2001. Google Scholar
  18. M. Kwiatkowska, G. Norman, and D. Parker. PRISM 4.0: Verification of probabilistic real-time systems. In Proceedings of Computer Aided Verification (CAV), volume 6806 of LNCS, pages 585-591. Springer, 2011. Google Scholar
  19. R. E. Ladner. Polynomial space counting problems. SIAM Journal on Computing, 18(6):1087-1097, 1989. Google Scholar
  20. R.B. Lyngsø and C.N.S. Pedersen. The consensus string problem and the complexity of comparing hidden Markov models. J. Comput. Syst. Sci., 65(3):545-569, 2002. Google Scholar
  21. M. Mitzenmacher and E. Upfal. Probability and Computing. Cambridge University Press, 2005. Google Scholar
  22. A. Paz. Introduction to Probabilistic Automata. Academic Press, 1971. Google Scholar
  23. L.R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257-286, 1989. Google Scholar
  24. M.-P. Schützenberger. On the definition of a family of automata. Inf. and Control, 4:245-270, 1961. Google Scholar
  25. S. Toda. PP is as hard as the polynomial-time hierarchy. SIAM Journal of Computing, 20(5):865-877, 1991. Google Scholar