Estimating Statistics on Words Using Ambiguous Descriptions

Nicaud, Cyril

doi:10.4230/LIPIcs.CPM.2016.9

Cyril Nicaud. Estimating Statistics on Words Using Ambiguous Descriptions. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, pp. 9:1-9:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)
https://doi.org/10.4230/LIPIcs.CPM.2016.9

Abstract

In this article we propose an alternative way to prove some recent results on statistics on words, such as the expected number of runs or the expected sum of the run exponents. Our approach consists of designing a general framework based on the symbolic method developed in analytic combinatorics. The descriptions obtained in this framework are built in such a way that the degree of ambiguity of an object O (i.e., the number of different descriptions corresponding to O) is exactly the value of the statistic under study for O. The asymptotic estimation of the expectation is then done using classical techniques from analytic combinatorics. To show the generality of our method, we not only apply it to obtain new proofs of known results but also extend them from the uniform distribution to any memoryless distribution.
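The counting idea behind the abstract can be illustrated by brute force on short words. The sketch below is not the paper's generating-function method; it only shows, assuming the usual definition of a run (a maximal factor of smallest period p and length at least 2p), that if each word is described once per run it contains, the expected number of runs equals the total weight of all descriptions. The function names `smallest_period`, `runs`, and `expected_runs` and the dictionary `probs` are hypothetical names chosen for this illustration.

```python
from itertools import product
from math import prod


def smallest_period(u):
    """Smallest q >= 1 such that u[t] == u[t + q] wherever both are defined."""
    n = len(u)
    return next(q for q in range(1, n + 1) if u[q:] == u[:n - q])


def runs(w):
    """Return the runs of w as triples (start, end, p), with w[start:end]
    a maximal factor of smallest period p and length at least 2p."""
    n = len(w)
    found = set()
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if w[i] != w[i + p]:
                i += 1
                continue
            # Extend the stretch on which w[k] == w[k + p] as far as possible.
            k = i
            while k + p < n and w[k] == w[k + p]:
                k += 1
            start, end = i, k + p
            # Keep the factor only if it is long enough and p is its smallest
            # period, so that each run is recorded exactly once.
            if end - start >= 2 * p and smallest_period(w[start:end]) == p:
                found.add((start, end, p))
            i = k + 1
    return found


def expected_runs(n, probs):
    """Expected number of runs in a random word of length n whose letters are
    drawn independently with probabilities probs (a memoryless source).
    Each pair (word, run) plays the role of one description of the word."""
    total = 0.0
    for letters in product(probs, repeat=n):
        w = "".join(letters)
        weight = prod(probs[c] for c in letters)
        total += weight * len(runs(w))
    return total
```

For example, `expected_runs(3, {"a": 0.5, "b": 0.5})` gives 0.75: six of the eight binary words of length 3 contain exactly one run, and `aba` and `bab` contain none. Passing non-uniform probabilities in `probs` gives the memoryless case. The enumeration is exponential in n, which is why asymptotic estimates require the analytic techniques mentioned in the abstract.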
