On the Complexity of the Median and Closest Permutation Problems
Genome rearrangements are events in which large blocks of DNA exchange places during evolution. The analysis of these events is a promising tool for understanding evolutionary genomics, providing data for phylogenetic reconstruction based on genome rearrangement measures. Many pairwise rearrangement distances have been proposed, each defined as the minimum number of rearrangement events needed to transform one genome into the other using some predefined operation. When more than two genomes are considered, we face the more challenging problem of rearrangement-based phylogeny reconstruction. Given a set of genomes and a distance notion, there are at least two natural ways to define the "target" genome. On the one hand, one may seek a genome that minimizes the sum of the distances to all input genomes, called the median genome. On the other hand, one may seek a genome that minimizes the maximum distance to any input genome, called the closest genome. When genomes are modeled as permutations of distinct integers, several distance metrics have been extensively studied. We investigate the median and closest problems on permutations under the following metrics: breakpoint distance, swap distance, block-interchange distance, short-block-move distance, and transposition distance. In biological applications some values are usually very small, such as the solution value d or the number k of input permutations. For each of these metrics and parameters d or k, we analyze the closest and the median problems from the viewpoint of parameterized complexity.
We obtain the following results: NP-hardness of finding the median/closest permutation for some of these distance metrics, even for only k = 3 permutations; polynomial kernels for the problems of finding the median permutation under all studied metrics, with the target distance d as the parameter; NP-hardness of finding the closest permutation by short-block-moves; and FPT algorithms and infeasibility of polynomial kernels for finding the closest permutation for some metrics when parameterized by the target distance d.
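To make the two objectives concrete, the following is a minimal illustrative sketch, not one of the algorithms from the paper. It assumes that a swap exchanges two arbitrary elements of a permutation, so that the swap distance equals n minus the number of cycles of the relative permutation. The function names swap_distance and brute_force_targets are hypothetical, and the exhaustive search over all n! candidates is only meant for very small n.

```python
from itertools import permutations


def swap_distance(p, q):
    """Minimum number of exchanges of two elements needed to turn p into q.

    Assumption: a swap may exchange any two positions, so the distance is
    n minus the number of cycles of the map "index in p -> index in q".
    """
    pos_in_q = {value: i for i, value in enumerate(q)}
    # target[i] is the index the element at index i of p must reach in q.
    target = [pos_in_q[value] for value in p]
    seen = [False] * len(p)
    cycles = 0
    for start in range(len(p)):
        if not seen[start]:
            cycles += 1
            i = start
            while not seen[i]:
                seen[i] = True
                i = target[i]
    return len(p) - cycles


def brute_force_targets(inputs):
    """Exhaustively find a median and a closest permutation.

    Returns (median, median_cost, closest, closest_radius), where the median
    minimizes the sum of distances to the inputs and the closest permutation
    minimizes the maximum distance to any input.
    """
    n = len(inputs[0])
    candidates = list(permutations(range(1, n + 1)))

    def total(c):
        return sum(swap_distance(c, p) for p in inputs)

    def radius(c):
        return max(swap_distance(c, p) for p in inputs)

    median = min(candidates, key=total)
    closest = min(candidates, key=radius)
    return median, total(median), closest, radius(closest)


# Example with k = 3 input permutations of {1, 2, 3}.
example = [(1, 2, 3), (2, 1, 3), (1, 3, 2)]
print(brute_force_targets(example))
```

In this small example the identity permutation is both the median (sum of distances 2) and the closest permutation (maximum distance 1). In general the two objectives need not select the same permutation.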
Median problem
Closest problem
Genome rearrangements
Parameterized complexity
Mathematics of computing~Combinatorics
Theory of computation~Parameterized complexity and exact algorithms
Mathematics of computing~Permutations and combinations
2:1-2:23
Regular Paper
http://arxiv.org/abs/2311.17224
Luís Cunha
Instituto de Computação, Universidade Federal Fluminense, Brasil
http://www.ic.uff.br/~lfignacio
https://orcid.org/0000-0002-3797-6053
FAPERJ-JCNE (E-26/201.372/2022), and CNPq-Universal (406173/2021-4).
Ignasi Sau
LIRMM, Université de Montpellier, CNRS, France
https://www.lirmm.fr/~sau/
https://orcid.org/0000-0002-8981-9287
project ELIT (ANR-20-CE48-0008-01), and CAPES/PRINT Programa Institucional de Internacionalização, edital nº 41/2017, grant 88887.717401/2022-00.
Uéverton Souza
Instituto de Computação, Universidade Federal Fluminense, Brasil
http://www.ic.uff.br/~ueverton
https://orcid.org/0000-0002-5320-9209
FAPERJ-JCNE (E-26/201.344/2021), and CNPq (309832/2020-9).
10.4230/LIPIcs.WABI.2024.2
Luís Cunha, Ignasi Sau, and Uéverton Souza
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode