The Longest Filled Common Subsequence Problem

Castelli, Mauro; Dondi, Riccardo; Mauri, Giancarlo; Zoppis, Italo

doi:10.4230/LIPIcs.CPM.2017.14

Keywords

longest common subsequence
approximation algorithms
computational complexity
fixed-parameter algorithms

Abstract

Inspired by a recent approach for genome reconstruction from incomplete data, we consider a variant of the longest common subsequence problem for the comparison of two sequences, one of which is incomplete, i.e. it has some missing elements. The new combinatorial problem, called Longest Filled Common Subsequence, given two sequences A and B, and a multiset M of symbols missing in B, asks for a sequence B* obtained by inserting the symbols of M into B so that B* induces a common subsequence with A of maximum length. First, we investigate the computational and approximation complexity of the problem and we show that it is NP-hard and APX-hard when A contains at most two occurrences of each symbol. Then, we give a 3/5 approximation algorithm for the problem. Finally, we present a fixed-parameter algorithm, when the problem is parameterized by the number of symbols inserted in B that "match" symbols of A.
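To make the problem statement concrete, the following is a minimal brute-force sketch that computes the optimum value on tiny instances. It is not the approximation algorithm or the fixed-parameter algorithm of the paper; it is only an exhaustive search over the definition, and its running time can be exponential in the size of M. The function name `lfcs_length` and the argument names `a`, `b`, `m` are hypothetical, chosen for illustration. The sketch relies on the observation that an inserted symbol can be placed anywhere in B, so a symbol of A may be matched either with the next usable symbol of B or with a copy still available in M.

```python
from functools import lru_cache


def lfcs_length(a, b, m):
    """Length of a longest common subsequence of `a` and some filling of `b` with `m`.

    Illustrative exhaustive search (assumption: small inputs only).
    `a` and `b` are strings, `m` is an iterable of single-character symbols.
    """

    @lru_cache(maxsize=None)
    def best(i, j, rest):
        # i: next position in a; j: next position in b;
        # rest: sorted tuple of symbols of m that are still unused.
        if i == len(a):
            return 0
        # Option 1: leave a[i] unmatched.
        result = best(i + 1, j, rest)
        if j < len(b):
            # Option 2: leave b[j] unmatched.
            result = max(result, best(i, j + 1, rest))
            # Option 3: match a[i] with b[j].
            if a[i] == b[j]:
                result = max(result, 1 + best(i + 1, j + 1, rest))
        # Option 4: insert an unused copy of a[i] just before b[j] and match it.
        if a[i] in rest:
            k = rest.index(a[i])
            result = max(result, 1 + best(i + 1, j, rest[:k] + rest[k + 1:]))
        return result

    return best(0, 0, tuple(sorted(m)))
```

For example, with A = "abcd", B = "bd" and M = {a, c}, inserting a before b and c between b and d gives B* = "abcd", so `lfcs_length("abcd", "bd", ["a", "c"])` evaluates to 4. Symbols of M that are never matched do not affect the value, since they can be inserted anywhere without shortening a common subsequence.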

Cite As

Mauro Castelli, Riccardo Dondi, Giancarlo Mauri, and Italo Zoppis. The Longest Filled Common Subsequence Problem. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 78, pp. 14:1-14:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017) https://doi.org/10.4230/LIPIcs.CPM.2017.14

