Formalization of Basic Combinatorics on Words

Authors Štěpán Holub , Štěpán Starosta

Thumbnail PDF


  • Filesize: 1.02 MB
  • 17 pages

Document Identifiers

Author Details

Štěpán Holub
  • Department of Algebra, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
Štěpán Starosta
  • Dept. of Applied Math., Faculty of Information Technology, Czech Technical University in Prague, Czech Republic


We would like to thank Manuel Eberl for useful suggestions concerning formalization. We are also grateful to anonymous referees whose criticism helped to improve the presentation significantly.

Cite AsGet BibTex

Štěpán Holub and Štěpán Starosta. Formalization of Basic Combinatorics on Words. In 12th International Conference on Interactive Theorem Proving (ITP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 193, pp. 22:1-22:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Combinatorics on Words is a rather young domain encompassing the study of words and formal languages. An archetypal example of a task in Combinatorics on Words is to solve the equation x ⋅ y = y ⋅ x, i.e., to describe words that commute. This contribution contains formalization of three important classical results in Isabelle/HOL. Namely i) the Periodicity Lemma (a.k.a. the theorem of Fine and Wilf), including a construction of a word proving its optimality; ii) the solution of the equation x^a ⋅ y^b = z^c with 2 ≤ a,b,c, known as the Lyndon-Schützenberger Equation; and iii) the Graph Lemma, which yields a generic upper bound on the rank of a solution of a system of equations. The formalization of those results is based on an evolving toolkit of several hundred auxiliary results which provide for smooth reasoning within more complex tasks.

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Combinatorics on words
  • combinatorics on words
  • formalization
  • Isabelle/HOL


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. Archive of Formal Proofs. Google Scholar
  2. Evelyne Barbin-Le Rest and Michel Le Rest. Sur la combinatoire des codes à deux mots. Theor. Comput. Sci., 41:61-80, 1985. URL:
  3. J Berstel and J Kkarhumäki. Combinatorics on Words endash a tutorial. In Current Trends in Theoretical Computer Science, pages 415-475. World Scientific, April 2004. URL:
  4. J Berstel, D Perrin, J.F Perrot, and A Restivo. Sur le théorème du défaut. Journal of Algebra, 60(1):169-180, 1979. URL:
  5. Jean Berstel and Dominique Perrin. The origins of combinatorics on words. European Journal of Combinatorics, 28(3):996-1022, 2007. URL:
  6. Joachim Breitner. Free groups. Archive of Formal Proofs, 2010. , Formal proof development. URL:
  7. Volker Diekert. Makanin’s algorithm. In Algebraic Combinatorics on Words, Encyclopedia of Mathematics and its Applications, pages 387-–442. Cambridge University Press, 2002. URL:
  8. Pál Dömösi and Géza Horváth. Alternative proof of the Lyndon–Schützenberger theorem. Theoretical Computer Science, 366(3):194-198, 2006. Automata and Formal Languages. URL:
  9. Georges Gonthier et al. A machine-checked proof of the odd order theorem. In ITP, volume 7998 of Lecture Notes in Computer Science, pages 163-179. Springer, 2013. Google Scholar
  10. N. J. Fine and H. S. Wilf. Uniqueness theorems for periodic functions. Proceedings of the American Mathematical Society, 16(1):109-109, January 1965. URL:
  11. Georges Gonthier. Formal proof - the four-color theorem. Notices Amer. Math. Soc., 55(11):1382-1393, 2008. Google Scholar
  12. Thomas Hales et al. A formal proof of the Kepler conjecture. Forum of Mathematics, Pi, 5:e2, 2017. URL:
  13. T. Harju and J. Karhumäki. On the defect theorem and simplifiability. Semigroup Forum, 33:199-217, 1986. Google Scholar
  14. Tero Harju, Juhani Karhumäki, and Wojciech Plandowski. Independent systems of equations. In Algebraic Combinatorics on Words, Encyclopedia of Mathematics and its Applications, pages 443-–471. Cambridge University Press, 2002. URL:
  15. Tero Harju and Dirk Nowotka. On the independence of equations in three variables. Theoretical Computer Science, 307(1):139-172, 2003. WORDS. URL:
  16. Tero Harju and Dirk Nowotka. The equation xⁱ = y^jz^k in a free semigroup. Semigroup Forum, 68(3):488-490, 2004. URL:
  17. John Harrison. Without loss of generality. In Stefan Berghofer, Tobias Nipkow, Christian Urban, and Makarius Wenzel, editors, Theorem Proving in Higher Order Logics, pages 43-59, Berlin, Heidelberg, 2009. Springer Berlin Heidelberg. Google Scholar
  18. Florent Hivert et al. Coq-Combi., 2021.
  19. Štěpán Holub. Commutation and beyond. In Srečko Brlek, Francesco Dolce, Christophe Reutenauer, and Élise Vandomme, editors, Combinatorics on Words, pages 1-5, Cham, 2017. Springer International Publishing. Google Scholar
  20. Štěpán Holub and Štěpán Starosta. Binary intersection formalized. Theor. Comput. Sci., to appear. Google Scholar
  21. Štěpán Holub and Robert Veroff. Formalizing a fragment of combinatorics on words. In Jarkko Kari, Florin Manea, and Ion Petre, editors, Unveiling Dynamics and Complexity, pages 24-31, Cham, 2017. Springer International Publishing. URL:
  22. Artur Jez. Recompression: A simple and powerful technique for word equations. J. ACM, 63(1):4:1-4:51, 2016. URL:
  23. Artur Jeż. Word equations in nondeterministic linear space. In 44th International Colloquium on Automata, Languages, and Programming, volume 80 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 95, 13. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2017. Google Scholar
  24. Juhani Karhumäki and Wojciech Plandowski. On the size of independent systems of equations in semigroups. Theoretical Computer Science, 168(1):105-119, 1996. URL:
  25. A. Lentin. Equations dans les monoides libres. De Gruyter Mouton, 1972. URL:
  26. M. Lothaire. Combinatorics on words. Cambridge Mathematical Library. Cambridge University Press, Cambridge, 1997. URL:
  27. R. C. Lyndon and M. P. Schützenberger. The equation a^m = b^nc^p in a free group. Michigan Math. J., 9(4):289-298, December 1962. URL:
  28. Gennadiy Semenovich Makanin. The problem of solvability of equations in a free semigroup. Matematicheskii Sbornik, 145(2):147-236, 1977. Google Scholar
  29. Dirk Nowotka and Aleksi Saarela. One-variable word equations and three-variable constant-free word equations. Int. J. Found. Comput. Sci., 29(5):935-950, 2018. URL:
  30. Dirk Nowotka and Aleksi Saarela. An optimal bound on the solution sets of one-variable word equations and its consequences. In 45th International Colloquium on Automata, Languages, and Programming, volume 107 of LIPIcs. Leibniz Int. Proc. Inform., pages Art. No. 136, 13. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, 2018. Google Scholar
  31. Aleksi Saarela. Independent systems of word equations: From Ehrenfeucht to eighteen. In Robert Mercaş and Daniel Reidenbach, editors, Combinatorics on Words, pages 60-67, Cham, 2019. Springer International Publishing. Google Scholar
  32. H.J. Shyr and S.S. Yu. Non-primitive words in the language p^+q^+. Soochow Journal of Mathematics, 20, January 1994. Google Scholar
  33. J.-P. Spehner. Quelques problèmes d'extension, de conjugaison et de presentation des sous-monoïdes d'un monoïde libre. PhD thesis, Université Paris VII, Paris, 1976. Google Scholar
  34. Axel Thue. Über unendliche Zeichenreichen. Skrifter: Matematisk-Naturvidenskapelig Klasse, 1906. Google Scholar
  35. Axel Thue. Uber die gegenseitige lage gleicher teile gewisser zeichenreihen. Kra. Vidensk. Selsk. Skrifer, I. Mat. Nat. Kl., pages 1-67, 1912. Google Scholar
  36. Štěpán Holub, Štěpán Starosta, et al. Combinatorics on words formalized (release v1.3)., 2021.
Questions / Remarks / Feedback

Feedback for Dagstuhl Publishing

Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail