Dichotomic Selection on Words: A Probabilistic Analysis

Authors Ali Akhavi, Julien Clément, Dimitri Darthenay, Loïck Lhote, Brigitte Vallée



PDF
Thumbnail PDF

File

LIPIcs.CPM.2019.19.pdf
  • Filesize: 0.68 MB
  • 19 pages

Document Identifiers

Author Details

Ali Akhavi
  • GREYC (Normandie Université, Unicaen, EnsiCaen, Cnrs), 14000, Caen, France
Julien Clément
  • GREYC (Normandie Université, Unicaen, EnsiCaen, Cnrs), 14000, Caen, France
Dimitri Darthenay
  • GREYC (Normandie Université, Unicaen, EnsiCaen, Cnrs), 14000, Caen, France
Loïck Lhote
  • GREYC (Normandie Université, Unicaen, EnsiCaen, Cnrs), 14000, Caen, France
Brigitte Vallée
  • GREYC (Normandie Université, Unicaen, EnsiCaen, Cnrs), 14000, Caen, France

Acknowledgements

We thank Pablo Rotondo for many interesting discussions. We also thank anonymous reviewers for pointing us to bibliographic references.

Cite AsGet BibTex

Ali Akhavi, Julien Clément, Dimitri Darthenay, Loïck Lhote, and Brigitte Vallée. Dichotomic Selection on Words: A Probabilistic Analysis. In 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 128, pp. 19:1-19:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.CPM.2019.19

Abstract

The paper studies the behaviour of selection algorithms that are based on dichotomy principles. On the entry formed by an ordered list L and a searched element x not in L, they return the interval of the list L the element x belongs to. We focus here on the case of words, where dichotomy principles lead to a selection algorithm designed by Crochemore, Hancart and Lecroq, which appears to be "quasi-optimal". We perform a probabilistic analysis of this algorithm that exhibits its quasi-optimality on average.

Subject Classification

ACM Subject Classification
  • Theory of computation → Data structures design and analysis
  • Theory of computation → Randomness, geometry and discrete structures
  • Theory of computation → Sorting and searching
  • Theory of computation → Pattern matching
  • Mathematics of computing → Combinatorics on words
  • Mathematics of computing → Discrete mathematics
Keywords
  • dichotomic selection
  • text algorithms
  • analysis of algorithms
  • average case analysis of algorithms
  • trie
  • suffix array
  • lcp-array
  • information theory
  • numeration process
  • sources
  • entropy
  • coincidence
  • analytic combinatorics
  • depoissonization techniques

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Arne Andersson, Torben Hagerup, Johan Håstad, and Ola Petersson. Tight Bounds for Searching a Sorted Array of Strings. SIAM J. Comput., 30(5):1552-1578, 2000. URL: http://dx.doi.org/10.1137/S0097539797329889.
  2. Alberto Apostolico, Maxime Crochemore, Martin Farach-Colton, Zvi Galil, and S. Muthukrishnan. 40 Years of Suffix Trees. Commun. ACM, 59(4):66-73, March 2016. URL: http://dx.doi.org/10.1145/2810036.
  3. Wieb Bosma, Karma Dajani, and Cor Kraaikamp. Entropy quotients and correct digits in number-theoretic expansions. In Dynamics &stochastics, volume 48 of IMS Lecture Notes Monogr. Ser., pages 176-188. Inst. Math. Statist., Beachwood, OH, 2006. URL: http://dx.doi.org/10.1214/074921706000000202.
  4. Julien Clément, James Allen Fill, Thu Hien Nguyen Thi, and Brigitte Vallée. Towards a Realistic Analysis of the QuickSelect Algorithm. Theory of Computing Systems, 58(4):528-578, May 2016. URL: http://dx.doi.org/10.1007/s00224-015-9633-5.
  5. Julien Clément, Philippe Flajolet, and Brigitte Vallée. Dynamical sources in information theory: A general analysis of trie structures. Algorithmica, 29(1):307-369, February 2001. URL: http://dx.doi.org/10.1007/BF02679623.
  6. Julien Clément, Thu-Hien Nguyen-Thi, and Brigitte Vallée. Towards a Realistic Analysis of Some Popular Sorting Algorithms. Combinatorics, Probability and Computing, 24(1):104-144, 2015. URL: http://dx.doi.org/10.1017/S0963548314000649.
  7. Maxime Crochemore, Christophe Hancart, and Thierry Lecroq. Algorithms on Strings. Cambridge University Press, New York, NY, USA, 2007. Google Scholar
  8. Karma Dajani and Adam Fieldsteel. Equipartition of interval partitions and an application to number theory. Proc. Amer. Math. Soc., 129(12):3453-3460, 2001. URL: http://dx.doi.org/10.1090/S0002-9939-01-06299-2.
  9. Luc Devroye. A Probabilistic Analysis of the Height of Tries and of the Complexity of Triesort. Acta Inf., 21(3):229-237, October 1984. URL: http://dx.doi.org/10.1007/BF00264248.
  10. Johannes Fischer. Inducing the LCP-Array. In Frank Dehne, John Iacono, and Jörg-Rüdiger Sack, editors, Algorithms and Data Structures, pages 374-385, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg. Google Scholar
  11. Gianni Franceschini and Roberto Grossi. No Sorting? Better Searching! In 45th Symposium on Foundations of Computer Science (FOCS 2004), 17-19 October 2004, Rome, Italy, Proceedings, pages 491-498, 2004. URL: http://dx.doi.org/10.1109/FOCS.2004.43.
  12. Simon Gog and Enno Ohlebusch. Fast and Lightweight LCP-array Construction Algorithms. In Proceedings of the Meeting on Algorithm Engineering &Expermiments, ALENEX '11, pages 25-34, Philadelphia, PA, USA, 2011. Society for Industrial and Applied Mathematics. Google Scholar
  13. G. H. Gonnet and R. Baeza-Yates. Handbook of Algorithms and Data Structures: In Pascal and C (2Nd Ed.). Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1991. Google Scholar
  14. Dan Gusfield. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997. URL: http://dx.doi.org/10.1017/CBO9780511574931.
  15. Juha Kärkkäinen, Giovanni Manzini, and Simon J. Puglisi. Permuted Longest-Common-Prefix Array. In Gregory Kucherov and Esko Ukkonen, editors, Combinatorial Pattern Matching, pages 181-192, Berlin, Heidelberg, 2009. Springer Berlin Heidelberg. Google Scholar
  16. Juha Kärkkäinen and Peter Sanders. Simple Linear Work Suffix Array Construction. In Proceedings of the 30th International Conference on Automata, Languages and Programming, ICALP'03, pages 943-955, Berlin, Heidelberg, 2003. Springer-Verlag. Google Scholar
  17. Toru Kasai, Gunho Lee, Hiroki Arimura, Setsuo Arikawa, and Kunsoo Park. Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications. In Amihood Amir, editor, Combinatorial Pattern Matching, pages 181-192, Berlin, Heidelberg, 2001. Springer Berlin Heidelberg. Google Scholar
  18. Donald E. Knuth. The Art of Computer Programming, Volume 3: (2Nd Ed.) Sorting and Searching. Addison Wesley Longman Publishing Co., Inc., Redwood City, CA, USA, 1998. Google Scholar
  19. U. Manber and G. Myers. Suffix Arrays: A New Method for On-Line String Searches. SIAM Journal on Computing, 22(5):935-948, 1993. URL: http://dx.doi.org/10.1137/0222058.
  20. Giovanni Manzini. Two Space Saving Tricks for Linear Time LCP Array Computation. In Torben Hagerup and Jyrki Katajainen, editors, Algorithm Theory - SWAT 2004, pages 372-383, Berlin, Heidelberg, 2004. Springer Berlin Heidelberg. Google Scholar
  21. Ge Nong, Sen Zhang, and Wai Hong Chan. Linear Suffix Array Construction by Almost Pure Induced-Sorting. In James A. Storer and Michael W. Marcellin, editors, DCC, pages 193-202. IEEE Computer Society, 2009. Google Scholar
  22. Simon J. Puglisi, W. F. Smyth, and Andrew H. Turpin. A Taxonomy of Suffix Array Construction Algorithms. ACM Comput. Surv., 39(2), July 2007. URL: http://dx.doi.org/10.1145/1242471.1242472.
  23. Robert Sedgewick. Algorithms in C. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1990. Google Scholar
  24. Wojciech Szpankowski. Average Case Analysis of Algorithms on Sequences. John Wiley &Sons, Inc., New York, NY, USA, 2001. Google Scholar
  25. Brigitte Vallée. Dynamical sources in information theory: Fundamental intervals and word prefixes. Algorithmica, 29(1):262-306, February 2001. URL: http://dx.doi.org/10.1007/BF02679622.
  26. Brigitte Vallée. The Depoissonisation Quintet: Rice-Poisson-Mellin-Newton-Laplace. In Proceedings of the AofA 2018 Conference, volume LIPICS Dagstuhl 110, pages 35:1-35:20, June 2018. (long version: https://arxiv.org/pdf/1802.04988). URL: http://dx.doi.org/10.4230/LIPIcs.AofA.2018.35.
  27. Brigitte Vallée, Julien Clément, James Allen Fill, and Philippe Flajolet. The Number of Symbol Comparisons in QuickSort and QuickSelect. In Automata, Languages and Programming, 36th International Colloquium, ICALP 2009, Rhodes, Greece, July 5-12, 2009, Proceedings, Part I, volume 5555 of Lecture Notes in Computer Science, pages 750-763, 2009. URL: http://dx.doi.org/10.1007/978-3-642-02927-1_62.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail