A QPTAS for Gapless MEC

Garg, Shilpa; Mömke, Tobias

doi:10.4230/LIPIcs.ESA.2018.34

Author Details

Shilpa Garg

Max Planck Institute for Informatics, Saarland Informatics Campus, Germany

Tobias Mömke

University of Bremen and Saarland University, Saarland Informatics Campus, Germany

Cite As

Shilpa Garg and Tobias Mömke. A QPTAS for Gapless MEC. In 26th Annual European Symposium on Algorithms (ESA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 112, pp. 34:1-34:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018) https://doi.org/10.4230/LIPIcs.ESA.2018.34

Abstract

We consider the problem Minimum Error Correction (MEC). A MEC instance is an n × m matrix M with entries from {0,1,-}. Feasible solutions are composed of two binary m-bit strings, together with an assignment of each row of M to one of the two strings. The objective is to minimize the number of mismatches (errors) where the row has a value that differs from the assigned solution string. The symbol "-" is a wildcard that matches both 0 and 1. A MEC instance is gapless if in each row of M all binary entries are consecutive. Gapless-MEC is a relevant problem in computational biology, and it is closely related to segmentation problems that were introduced by [Kleinberg-Papadimitriou-Raghavan STOC'98] in the context of data mining. Without restrictions, it is known to be UG-hard to compute an O(1)-approximate solution to MEC. For both MEC and Gapless-MEC, the best polynomial time approximation algorithm has a logarithmic performance guarantee. We partially settle the approximation status of Gapless-MEC by providing a quasi-polynomial time approximation scheme (QPTAS). Additionally, for the relevant case where the binary part of a row is not contained in the binary part of another row, we provide a polynomial time approximation scheme (PTAS).
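
To make the definitions in the abstract concrete, the following minimal Python sketch evaluates the MEC objective for a given pair of solution strings and checks whether an instance is gapless. It is not the QPTAS or PTAS of the paper; it only illustrates the problem statement. The function names (`row_errors`, `mec_cost`, `is_gapless`) and the encoding of rows as strings over `0`, `1`, `-` are assumptions made for this illustration.

```python
from typing import List, Sequence, Tuple

WILDCARD = "-"  # matches both 0 and 1, so it never counts as an error


def row_errors(row: str, solution: str) -> int:
    """Count positions where a binary entry of `row` differs from `solution`."""
    return sum(1 for r, s in zip(row, solution) if r != WILDCARD and r != s)


def mec_cost(matrix: Sequence[str], sol_a: str, sol_b: str) -> Tuple[int, List[int]]:
    """Cost of the two solution strings under the best row assignment.

    For fixed solution strings, each row is independently assigned to the
    string with fewer mismatches (ties go to string 0).
    Returns (total errors, assignment), with assignment[i] in {0, 1}.
    """
    total = 0
    assignment = []
    for row in matrix:
        err_a = row_errors(row, sol_a)
        err_b = row_errors(row, sol_b)
        assignment.append(0 if err_a <= err_b else 1)
        total += min(err_a, err_b)
    return total, assignment


def is_gapless(matrix: Sequence[str]) -> bool:
    """True if, in every row, all binary entries are consecutive."""
    for row in matrix:
        positions = [j for j, c in enumerate(row) if c != WILDCARD]
        # Consecutive positions span exactly as many columns as there are entries.
        if positions and positions[-1] - positions[0] + 1 != len(positions):
            return False
    return True


# Hypothetical 4 x 4 instance, chosen only for illustration.
M = ["01--", "-011", "110-", "--10"]
cost, assignment = mec_cost(M, "0011", "1100")
print(cost, assignment, is_gapless(M))
```

Evaluating a candidate solution this way takes O(nm) time; the difficulty of MEC lies in choosing the two solution strings, not in assigning rows once they are fixed.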

Subject Classification

ACM Subject Classification

Theory of computation → Approximation algorithms analysis
Theory of computation → Dynamic programming

Keywords

approximation algorithms
QPTAS
minimum error correction
segmentation
computational biology

References

Noga Alon and Benny Sudakov. On two segmentation problems. Journal of Algorithms, 33(1):173-184, 1999.
Paola Bonizzoni, Riccardo Dondi, Gunnar W Klau, Yuri Pirola, Nadia Pisanti, and Simone Zaccaria. On the minimum error correction problem for haplotype assembly in diploid and polyploid genomes. Journal of Computational Biology, 2016.
Rudi Cilibrasi, Leo van Iersel, Steven Kelk, and John Tromp. The Complexity of the Single Individual SNP Haplotyping Problem. Algorithmica, 49(1):13-36, 2007. URL: http://dx.doi.org/10.1007/s00453-007-0029-z.
Uriel Feige. NP-hardness of hypercube 2-segmentation. CoRR, abs/1411.0821, 2014.
Pierre Fouilhoux and A. Ridha Mahjoub. Solving VLSI design and DNA sequencing problems using bipartization of graphs. Computational Optimization and Applications, 51(2):749-781, 2012. URL: http://dx.doi.org/10.1007/s10589-010-9355-1.
Shilpa Garg, Marcel Martin, and Tobias Marschall. Read-based phasing of related individuals. Bioinformatics, 32(12):i234-i242, 2016.
Dan He, Arthur Choi, Knot Pipatsrisawat, Adnan Darwiche, and Eleazar Eskin. Optimal algorithms for haplotype assembly from whole-genome sequence data. Bioinformatics, 26(12):i183-i190, 2010. URL: http://dx.doi.org/10.1093/bioinformatics/btq215.
Yishan Jiao, Jingyi Xu, and Ming Li. On the k-closest substring and k-consensus pattern problems. In CPM, volume 3109 of Lecture Notes in Computer Science, pages 130-144. Springer, 2004.
Jon M. Kleinberg, Christos H. Papadimitriou, and Prabhakar Raghavan. Segmentation problems. In STOC, pages 473-482. ACM, 1998.
Jon M. Kleinberg, Christos H. Papadimitriou, and Prabhakar Raghavan. Segmentation problems. J. ACM, 51(2):263-280, 2004.
Danny Leung, Inkyung Jung, Nisha Rajagopal, Anthony Schmitt, Siddarth Selvaraj, Ah Young Lee, Chia-An Yen, Shin Lin, Yiing Lin, Yunjiang Qiu, et al. Integrative analysis of haplotype-resolved epigenomes across human tissues. Nature, 518(7539):350-354, 2015.
Ming Li, Bin Ma, and Lusheng Wang. Finding similar regions in many sequences. J. Comput. Syst. Sci., 65(1):73-96, 2002.
Rafail Ostrovsky and Yuval Rabani. Polynomial-time approximation schemes for geometric min-sum median clustering. J. ACM, 49(2):139-156, 2002.
Murray Patterson, Tobias Marschall, Nadia Pisanti, Leo van Iersel, Leen Stougie, Gunnar W. Klau, and Alexander Schönhuth. WhatsHap: Weighted haplotype assembly for future-generation sequencing reads. Journal of Computational Biology, 22(6):498-509, 2015. URL: http://dx.doi.org/10.1089/cmb.2014.0157.
Yuri Pirola, Simone Zaccaria, Riccardo Dondi, Gunnar W. Klau, Nadia Pisanti, and Paola Bonizzoni. HapCol: accurate and memory-efficient haplotype assembly from long reads. Bioinformatics, page btv495, 2015. URL: http://dx.doi.org/10.1093/bioinformatics/btv495.
Jan Remy and Angelika Steger. Approximation schemes for node-weighted geometric Steiner tree problems. Algorithmica, 55(1):240-267, 2009.
Matthew W Snyder, Andrew Adey, Jacob O Kitzman, and Jay Shendure. Haplotype-resolved genome sequencing: experimental methods and applications. Nature Reviews Genetics, 16(6):344-358, 2015.
Ryan Tewhey, Vikas Bansal, Ali Torkamani, Eric J Topol, and Nicholas J Schork. The importance of phase information for human genomics. Nature Reviews Genetics, 12(3):215-223, 2011.
Sharon Wulff, Ruth Urner, and Shai Ben-David. Monochromatic bi-clustering. In ICML (2), volume 28 of JMLR Workshop and Conference Proceedings, pages 145-153. JMLR.org, 2013.
