Estimating Statistics on Words Using Ambiguous Descriptions

Author Cyril Nicaud

  • 12 pages

Cite As

Cyril Nicaud. Estimating Statistics on Words Using Ambiguous Descriptions. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, pp. 9:1-9:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


In this article we propose an alternative way to prove some recent results on statistics on words, such as the expected number of runs or the expected sum of the run exponents. Our approach consists in designing a general framework, based on the symbolic method developed in analytic combinatorics. The descriptions obtained in this framework are built in such a way that the degree of ambiguity of an object O (i.e., the number of different descriptions corresponding to O) is exactly the value of the statistic under study for O. The asymptotic estimation of the expectation is then done using classical techniques from analytic combinatorics. To show the generality of our method, we not only apply it to obtain new proofs of known results but also extend them from the uniform distribution to any memoryless distribution.
  • random words
  • runs
  • symbolic method
  • analytic combinatorics
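
The counting idea in the abstract can be illustrated on small cases with a brute-force sketch. This is not the paper's method, which relies on generating functions and asymptotic analysis. It only shows that counting (word, marked run) pairs and dividing by the number of words gives the expected number of runs under the uniform distribution. The sketch assumes the usual definition of a run as a maximal repetition: a factor whose length is at least twice its smallest period and that cannot be extended to the left or right with the same period. The function names `runs` and `expected_runs` are hypothetical and chosen for illustration.

```python
from fractions import Fraction
from itertools import product


def runs(w):
    """Return {(start, end): smallest period} for every run of w.

    `end` is exclusive. Brute force, meant for short words only.
    """
    n = len(w)
    found = {}
    for p in range(1, n // 2 + 1):
        for i in range(n - 2 * p + 1):
            # Skip starts that could be extended to the left with period p.
            if i > 0 and w[i - 1] == w[i - 1 + p]:
                continue
            # Extend to the right as long as period p is preserved.
            j = i + p
            while j < n and w[j] == w[j - p]:
                j += 1
            if j - i >= 2 * p:
                # Periods are tried in increasing order, so the first period
                # recorded for an interval is its smallest one.
                found.setdefault((i, j), p)
    return found


def expected_runs(n, alphabet="ab"):
    """Exact expected number of runs in a uniform random word of length n."""
    # Each (word, run) pair is one "description" of the word, so a word with
    # r runs is counted r times in this total.
    total = sum(len(runs(w)) for w in product(alphabet, repeat=n))
    return Fraction(total, len(alphabet) ** n)
```

For example, `runs("mississippi")` finds four runs (two occurrences of "ss", one "pp", and "ississi" with period 3), and `expected_runs(3)` returns 3/4 for the binary alphabet. The enumeration takes time exponential in n, which is why an analytic approach is needed for asymptotic estimates.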



