Maximal Number of Subword Occurrences in a Word

Author Wenjie Fang




File

LIPIcs.AofA.2024.3.pdf
  • Filesize: 0.69 MB
  • 12 pages

Author Details

Wenjie Fang
  • Univ Gustave Eiffel, CNRS, LIGM, F-77454 Marne-la-Vallée, France

Acknowledgements

I would like to thank Stéphane Vialette for bringing the question of the maximal number of subword occurrences of a given word to our attention, and for giving an idea for Proposition 3.4.

Cite As

Wenjie Fang. Maximal Number of Subword Occurrences in a Word. In 35th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 302, pp. 3:1-3:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.AofA.2024.3

Abstract

We consider the number of occurrences of subwords (not necessarily consecutive subsequences) in a given word. We first define the subword entropy of a word, which measures the maximal number of occurrences among all possible subwords. We then give upper and lower bounds on the minimal subword entropy for words of fixed length over a fixed alphabet, and show that the minimal subword entropy per letter has a limit value. A better upper bound on the minimal subword entropy for a binary alphabet is then obtained by looking at certain families of periodic words. We also state some conjectures based on experimental observations.
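To make the central quantity concrete, here is a minimal Python sketch that counts the occurrences of a subword in a word and then searches for a subword with the most occurrences. It is an illustration only and is not taken from the paper or its accompanying software: the function names count_occurrences and max_occurrences are hypothetical, the search is a brute force over all candidate subwords (exponential in the word length, so suitable only for short words), and the exact normalization used in the paper's definition of subword entropy is not given in the abstract, so only the raw maximal count is computed.

```python
from itertools import product


def count_occurrences(word, sub):
    """Count the embeddings of `sub` in `word` as a subsequence
    (letters in order, not necessarily consecutive)."""
    # dp[j] = number of ways to embed sub[:j] in the part of `word` read so far.
    dp = [1] + [0] * len(sub)
    for letter in word:
        # Go right to left so that this letter of `word` extends only
        # embeddings that ended strictly before it.
        for j in range(len(sub), 0, -1):
            if sub[j - 1] == letter:
                dp[j] += dp[j - 1]
    return dp[len(sub)]


def max_occurrences(word):
    """Brute force: return (count, subword) maximizing the number of
    occurrences over all non-empty candidate subwords of `word`.
    Hypothetical helper for illustration; exponential in len(word)."""
    alphabet = sorted(set(word))
    best = (1, "")  # the empty subword occurs exactly once
    for length in range(1, len(word) + 1):
        for candidate in product(alphabet, repeat=length):
            sub = "".join(candidate)
            count = count_occurrences(word, sub)
            if count > best[0]:
                best = (count, sub)
    return best


if __name__ == "__main__":
    print(count_occurrences("abab", "ab"))
    print(max_occurrences("abab"))
```

For example, the subword ab occurs three times in abab (using positions 1-2, 1-4, and 3-4), and no other subword of abab occurs more often.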

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Enumeration
  • Mathematics of computing → Combinatorics on words
Keywords
  • Subword occurrence
  • subword entropy
  • enumeration
  • periodic words

