Superstrings with multiplicities

Authors Bastien Cazaux, Eric Rivals

Thumbnail PDF


  • Filesize: 0.71 MB
  • 16 pages

Document Identifiers

Author Details

Bastien Cazaux
  • Department of Computer Science, University of Helsinki, Helsinki, Finland
  • , L.I.R.M.M., Université Montpellier, Montpellier, France & Institute of Computational Biology, Montpellier, France
Eric Rivals
  • L.I.R.M.M., Université Montpellier, Montpellier, France & Institute of Computational Biology, Montpellier, France

Cite AsGet BibTex

Bastien Cazaux and Eric Rivals. Superstrings with multiplicities. In 29th Annual Symposium on Combinatorial Pattern Matching (CPM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 105, pp. 21:1-21:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


A superstring of a set of words P = {s_1, ..., s_p } is a string that contains each word of P as substring. Given P, the well known Shortest Linear Superstring problem (SLS), asks for a shortest superstring of P. In a variant of SLS, called Multi-SLS, each word s_i comes with an integer m(i), its multiplicity, that sets a constraint on its number of occurrences, and the goal is to find a shortest superstring that contains at least m(i) occurrences of s_i. Multi-SLS generalizes SLS and is obviously as hard to solve, but it has been studied only in special cases (with words of length 2 or with a fixed number of words). The approximability of Multi-SLS in the general case remains open. Here, we study the approximability of Multi-SLS and that of the companion problem Multi-SCCS, which asks for a shortest cyclic cover instead of shortest superstring. First, we investigate the approximation of a greedy algorithm for maximizing the compression offered by a superstring or by a cyclic cover: the approximation ratio is 1/2 for Multi-SLS and 1 for Multi-SCCS. Then, we exhibit a linear time approximation algorithm, Concat-Greedy, and show it achieves a ratio of 4 regarding the superstring length. This demonstrates that for both measures Multi-SLS belongs to the class of APX problems.

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Discrete mathematics
  • Theory of computation → Approximation algorithms analysis
  • greedy algorithm
  • approximation
  • overlap
  • cyclic cover
  • APX
  • subset system


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. Eric L. Anson and Eugene W. Myers. Algorithms for whole genome shotgun sequencing. In Sorin Istrail, Pavel A. Pevzner, and Michael S. Waterman, editors, Proceedings of the Third Annual International Conference on Research in Computational Molecular Biology, RECOMB 1999, Lyon, France, April 11-14, 1999, pages 1-9. ACM, 1999. URL:
  2. Davide Bilò, Hans-Joachim Böckenhauer, Dennis Komm, Richard Královic, Tobias Mömke, Sebastian Seibert, and Anna Zych. Reoptimization of the shortest common superstring problem. Algorithmica, 61(2):227-251, 2011. URL:
  3. Avrim Blum, Tao Jiang, Ming Li, John Tromp, and Mihalis Yannakakis. Linear approximation of shortest superstrings. J. ACM, 41(4):630-647, 1994. Google Scholar
  4. Bastien Cazaux, Samuel Juhel, and Eric Rivals. Practical lower and upper bounds for the shortest linear superstring. In Gianlorenzo D'Angelo, editor, 17th International Symposium on Experimental Algorithms (SEA) 2018, June 27-29, 2018, L'Aquila, Italy, volume 103 of LIPIcs, page in press, 2018. URL:
  5. Bastien Cazaux and Eric Rivals. A linear time algorithm for Shortest Cyclic Cover of Strings. J. Discrete Algorithms, 37:56-67, 2016. Google Scholar
  6. Bastien Cazaux and Eric Rivals. The power of greedy algorithms for approximating Max-ATSP, Cyclic Cover, and superstrings. Discrete Applied Mathematics, 212:48-60, 2016. Google Scholar
  7. Bastien Cazaux and Eric Rivals. Relationship between superstring and compression measures: New insights on the greedy conjecture. Discrete Applied Mathematics, 2017. URL:
  8. Maxime Crochemore, Marek Cygan, Costas S. Iliopoulos, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. Algorithms for three versions of the shortest common superstring problem. In Amihood Amir and Laxmi Parida, editors, Combinatorial Pattern Matching, 21st Annual Symposium, CPM 2010, New York, NY, USA, June 21-23, 2010. Proceedings, volume 6129 of Lecture Notes in Computer Science, pages 299-309. Springer, 2010. URL:
  9. Gabriele Fici, Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. On the greedy algorithm for the Shortest Common Superstring problem with reversals. Inf. Proc. Letters, 116(3):245-251, 2016. Google Scholar
  10. John Gallant, David Maier, and James A. Storer. On finding minimal length superstrings. Journal of Computer and System Sciences, 20:50-58, 1980. Google Scholar
  11. Theodoros P. Gevezes and Leonidas S. Pitsoulis. Optimization in Science and Engineering: In Honor of the 60th Birthday of Panos M. Pardalos, chapter The Shortest Superstring Problem, pages 189-227. Springer New York, New York, NY, 2014. URL:
  12. Tao Jiang, Ming Li, and Ding-Zhu Du. A note on shortest superstrings with flipping. Information Processing Letters, 44(4):195-199, 1992. Google Scholar
  13. Julián Mestre. Greedy in Approximation Algorithms. In Proceedings of 14th Annual European Symposium on Algorithms (ESA), volume 4168 of Lecture Notes in Computer Science, pages 528-539. Springer, 2006. Google Scholar
  14. Gaspard Monge. Mémoire sur la théorie des déblais et des remblais. In Mémoires de l'Académie Royale des Sciences, pages 666-704, 1781. Google Scholar
  15. Jorma Tarhio and Esko Ukkonen. A greedy approximation algorithm for constructing shortest common superstrings. Theor. Comput. Sci., 57:131-145, 1988. URL:
  16. Esko Ukkonen. On-line construction of suffix trees. Algorithmica, 14:249-260, 1995. Google Scholar
  17. Virginia Vassilevska. Explicit inapproximability bounds for the shortest superstring problem. In 30th Int. Symp. on Mathematical Foundations of Computer Science (MFCS), volume 3618 of Lecture Notes in Computer Science, pages 793-800. Springer, 2005. Google Scholar