On the Complexity of Grammar-Based Compression over Fixed Alphabets

Authors: Katrin Casel, Henning Fernau, Serge Gaspers, Benjamin Gras, Markus L. Schmid



Cite As

Katrin Casel, Henning Fernau, Serge Gaspers, Benjamin Gras, and Markus L. Schmid. On the Complexity of Grammar-Based Compression over Fixed Alphabets. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 122:1-122:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Abstract

It is shown that the shortest-grammar problem remains NP-complete if the alphabet is fixed and has size at least 24, which settles an open question. On the other hand, the problem can be solved in polynomial time if the number of nonterminals is bounded; this is shown by encoding it as a problem on graphs with interval structure. Furthermore, we present an O(3^n) exact exponential-time algorithm based on dynamic programming. Similar results are also given for 1-level grammars, i.e., grammars for which only the start rule contains nonterminals on the right side, thereby investigating the impact of the "hierarchical depth" on the complexity of the shortest-grammar problem.

Keywords

  • Grammar-Based Compression
  • Straight-Line Programs
  • NP-Completeness
  • Exact Exponential Time Algorithms
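
To make the objects in the abstract concrete, the following is a minimal illustrative sketch, not taken from the paper. It represents a grammar that derives exactly one word (a straight-line program) as a dictionary, expands it, and checks whether it is a 1-level grammar in the sense described above. The size measure used here (total length of all right-hand sides) is an assumed, commonly used convention; the example word and the names `expand`, `size`, and `is_one_level` are hypothetical.

```python
from typing import Dict, List

# A grammar maps each nonterminal to its single right-hand side.
# Any symbol that is not a key of the dict is treated as a terminal.
Grammar = Dict[str, List[str]]


def expand(grammar: Grammar, symbol: str) -> str:
    """Return the unique word derived from `symbol` (grammar must be acyclic)."""
    if symbol not in grammar:
        return symbol
    return "".join(expand(grammar, s) for s in grammar[symbol])


def size(grammar: Grammar) -> int:
    """Assumed size measure: total length of all right-hand sides."""
    return sum(len(rhs) for rhs in grammar.values())


def is_one_level(grammar: Grammar, start: str) -> bool:
    """True if only the start rule has nonterminals on its right side."""
    return all(
        s not in grammar
        for head, rhs in grammar.items()
        if head != start
        for s in rhs
    )


# Two hypothetical grammars for the same word "abababab".
hierarchical = {"S": ["B", "B"], "B": ["A", "A"], "A": ["a", "b"]}
one_level = {"S": ["A", "A", "A", "A"], "A": ["a", "b"]}

if __name__ == "__main__":
    for name, g in (("hierarchical", hierarchical), ("one_level", one_level)):
        print(name, expand(g, "S"), size(g), is_one_level(g, "S"))
```

The shortest-grammar problem asks for a grammar of minimum size among all grammars deriving a given word; the sketch only evaluates given grammars and does not search for an optimal one.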



