A QPTAS for Gapless MEC

Authors: Shilpa Garg, Tobias Mömke

Author Details

Shilpa Garg
  • Max Planck Institute for Informatics, Saarland Informatics Campus, Germany
Tobias Mömke
  • University of Bremen and Saarland University, Saarland Informatics Campus, Germany

Cite As

Shilpa Garg and Tobias Mömke. A QPTAS for Gapless MEC. In 26th Annual European Symposium on Algorithms (ESA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 112, pp. 34:1-34:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Abstract

We consider the problem Minimum Error Correction (MEC). A MEC instance is an n × m matrix M with entries from {0,1,-}. A feasible solution consists of two binary m-bit strings together with an assignment of each row of M to one of the two strings. The objective is to minimize the number of mismatches (errors), that is, the positions where a row has a value that differs from its assigned solution string. The symbol "-" is a wildcard that matches both 0 and 1. A MEC instance is gapless if in each row of M all binary entries are consecutive. Gapless-MEC is a relevant problem in computational biology, and it is closely related to segmentation problems that were introduced by [Kleinberg-Papadimitriou-Raghavan STOC'98] in the context of data mining. Without restrictions, it is known to be UG-hard to compute an O(1)-approximate solution to MEC. For both MEC and Gapless-MEC, the best polynomial time approximation algorithm has a logarithmic performance guarantee. We partially settle the approximation status of Gapless-MEC by providing a quasi-polynomial time approximation scheme (QPTAS). Additionally, for the relevant case where the binary part of a row is not contained in the binary part of another row, we provide a polynomial time approximation scheme (PTAS).
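The problem definition above can be made concrete with a small sketch. The Python below is an illustration only, not the QPTAS or PTAS of the paper: it evaluates the MEC objective for a given pair of strings, checks the gapless property, and finds an optimum by exhaustive search, which takes time exponential in m and is usable only on toy instances. Representing rows as strings over "0", "1", "-" and all function names (`mismatches`, `mec_cost`, `is_gapless`, `brute_force_mec`) are assumptions made for this example.

```python
from itertools import product

WILDCARD = "-"  # matches both 0 and 1


def mismatches(row, s):
    """Count positions where a binary entry of `row` differs from string `s`."""
    return sum(1 for r, c in zip(row, s) if r != WILDCARD and r != c)


def mec_cost(matrix, s1, s2):
    """MEC objective for the strings (s1, s2).

    For fixed strings, assigning each row to the string it mismatches
    least is an optimal assignment, so we take the minimum per row.
    """
    return sum(min(mismatches(row, s1), mismatches(row, s2)) for row in matrix)


def is_gapless(matrix):
    """True if in every row the binary entries occupy consecutive columns."""
    for row in matrix:
        cols = [j for j, r in enumerate(row) if r != WILDCARD]
        if cols and cols[-1] - cols[0] + 1 != len(cols):
            return False
    return True


def brute_force_mec(matrix):
    """Exhaustive search over all pairs of m-bit strings (toy sizes only).

    Returns (cost, s1, s2) for one optimal solution.
    """
    m = len(matrix[0])
    candidates = ["".join(bits) for bits in product("01", repeat=m)]
    best = None
    for s1 in candidates:
        for s2 in candidates:
            cost = mec_cost(matrix, s1, s2)
            if best is None or cost < best[0]:
                best = (cost, s1, s2)
    return best


# A hypothetical gapless instance with n = 5 rows and m = 4 columns.
M = ["01--", "-110", "10--", "--01", "0111"]
print(is_gapless(M))
print(mec_cost(M, "0110", "1001"))
print(brute_force_mec(M)[0])
```

In this toy instance the rows "01--" and "10--" conflict, so they must go to different strings, and the row "0111" then conflicts with at least one row on either side; any solution therefore incurs at least one error.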

Subject Classification

ACM Subject Classification
  • Theory of computation → Approximation algorithms analysis
  • Theory of computation → Dynamic programming
Keywords
  • approximation algorithms
  • minimum error correction
  • segmentation
  • computational biology




References

  1. Noga Alon and Benny Sudakov. On two segmentation problems. Journal of Algorithms, 33(1):173-184, 1999.
  2. Paola Bonizzoni, Riccardo Dondi, Gunnar W. Klau, Yuri Pirola, Nadia Pisanti, and Simone Zaccaria. On the minimum error correction problem for haplotype assembly in diploid and polyploid genomes. Journal of Computational Biology, 2016.
  3. Rudi Cilibrasi, Leo van Iersel, Steven Kelk, and John Tromp. The Complexity of the Single Individual SNP Haplotyping Problem. Algorithmica, 49(1):13-36, Aug 2007. URL: http://dx.doi.org/10.1007/s00453-007-0029-z.
  4. Uriel Feige. NP-hardness of hypercube 2-segmentation. CoRR, abs/1411.0821, 2014.
  5. Pierre Fouilhoux and A. Ridha Mahjoub. Solving VLSI design and DNA sequencing problems using bipartization of graphs. Computational Optimization and Applications, 51(2):749-781, 2012. URL: http://dx.doi.org/10.1007/s10589-010-9355-1.
  6. Shilpa Garg, Marcel Martin, and Tobias Marschall. Read-based phasing of related individuals. Bioinformatics, 32(12):i234-i242, 2016.
  7. Dan He, Arthur Choi, Knot Pipatsrisawat, Adnan Darwiche, and Eleazar Eskin. Optimal algorithms for haplotype assembly from whole-genome sequence data. Bioinformatics, 26(12):i183-i190, 2010. URL: http://dx.doi.org/10.1093/bioinformatics/btq215.
  8. Yishan Jiao, Jingyi Xu, and Ming Li. On the k-closest substring and k-consensus pattern problems. In CPM, volume 3109 of Lecture Notes in Computer Science, pages 130-144. Springer, 2004.
  9. Jon M. Kleinberg, Christos H. Papadimitriou, and Prabhakar Raghavan. Segmentation problems. In STOC, pages 473-482. ACM, 1998.
  10. Jon M. Kleinberg, Christos H. Papadimitriou, and Prabhakar Raghavan. Segmentation problems. J. ACM, 51(2):263-280, 2004.
  11. Danny Leung, Inkyung Jung, Nisha Rajagopal, Anthony Schmitt, Siddarth Selvaraj, Ah Young Lee, Chia-An Yen, Shin Lin, Yiing Lin, Yunjiang Qiu, et al. Integrative analysis of haplotype-resolved epigenomes across human tissues. Nature, 518(7539):350-354, 2015.
  12. Ming Li, Bin Ma, and Lusheng Wang. Finding similar regions in many sequences. J. Comput. Syst. Sci., 65(1):73-96, 2002.
  13. Rafail Ostrovsky and Yuval Rabani. Polynomial-time approximation schemes for geometric min-sum median clustering. J. ACM, 49(2):139-156, 2002.
  14. Murray Patterson, Tobias Marschall, Nadia Pisanti, Leo van Iersel, Leen Stougie, Gunnar W. Klau, and Alexander Schönhuth. WhatsHap: Weighted haplotype assembly for future-generation sequencing reads. Journal of Computational Biology, 22(6):498-509, Feb 2015. URL: http://dx.doi.org/10.1089/cmb.2014.0157.
  15. Yuri Pirola, Simone Zaccaria, Riccardo Dondi, Gunnar W. Klau, Nadia Pisanti, and Paola Bonizzoni. HapCol: accurate and memory-efficient haplotype assembly from long reads. Bioinformatics, page btv495, Aug 2015. URL: http://dx.doi.org/10.1093/bioinformatics/btv495.
  16. Jan Remy and Angelika Steger. Approximation schemes for node-weighted geometric Steiner tree problems. Algorithmica, 55(1):240-267, 2009.
  17. Matthew W. Snyder, Andrew Adey, Jacob O. Kitzman, and Jay Shendure. Haplotype-resolved genome sequencing: experimental methods and applications. Nature Reviews Genetics, 16(6):344-358, 2015.
  18. Ryan Tewhey, Vikas Bansal, Ali Torkamani, Eric J. Topol, and Nicholas J. Schork. The importance of phase information for human genomics. Nature Reviews Genetics, 12(3):215-223, 2011.
  19. Sharon Wulff, Ruth Urner, and Shai Ben-David. Monochromatic bi-clustering. In ICML (2), volume 28 of JMLR Workshop and Conference Proceedings, pages 145-153. JMLR.org, 2013.