# The Maximum Duo-Preservation String Mapping Problem with Bounded Alphabet

## Cite As

Nicolas Boria, Laurent Gourvès, Vangelis Th. Paschos, and Jérôme Monnot. The Maximum Duo-Preservation String Mapping Problem with Bounded Alphabet. In 21st International Workshop on Algorithms in Bioinformatics (WABI 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 201, pp. 5:1-5:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.WABI.2021.5

## Abstract

Given two strings A and B such that B is a permutation of A, the max duo-preservation string mapping (MPSM) problem asks to find a mapping π between them so as to preserve a maximum number of duos. A duo is any pair of consecutive characters in a string, and it is preserved by π if its two consecutive characters in A are mapped to the same two consecutive characters in B. This problem has received growing attention in recent years, partly as an alternative way to produce approximation algorithms for its minimization counterpart, min common string partition, a widely studied problem due to its applications in comparative genomics. Considering this favored field of application with a short alphabet, it is surprising that MPSM^𝓁, the variant of MPSM with bounded alphabet, has received so little attention, with a single yet impressive work that provides a 2.67-approximation achieved in O(n) time [Brubach, 2018], where n = |A| = |B|. Our work focuses on MPSM^𝓁, and our main contribution is the demonstration that this problem admits a Polynomial Time Approximation Scheme (PTAS) when 𝓁 = O(1). We also provide an alternate, somewhat simpler, proof of NP-hardness for this problem compared with the NP-hardness proof presented in [Haitao Jiang et al., 2012].
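
To make the definition of a preserved duo concrete, here is a minimal Python sketch. It is not the paper's PTAS, only an illustration of the objective. It assumes the mapping π is given as a list `pi`, where `pi[i]` is the position in B to which position `i` of A is mapped. The function names `preserved_duos` and `brute_force_mpsm` are hypothetical, and the brute-force search is exponential, so it is usable only on very short strings.

```python
from itertools import permutations


def preserved_duos(a, b, pi):
    """Count the duos of `a` preserved by the mapping `pi`.

    pi[i] is the position in `b` that position i of `a` maps to
    (an assumed encoding of the mapping, chosen for this sketch).
    """
    # A valid mapping is a bijection that sends each character
    # to an occurrence of the same character in b.
    assert sorted(pi) == list(range(len(b)))
    assert all(a[i] == b[pi[i]] for i in range(len(a)))
    # The duo (i, i+1) is preserved when its two characters land
    # on consecutive positions of b, in the same order.
    return sum(1 for i in range(len(a) - 1) if pi[i + 1] == pi[i] + 1)


def brute_force_mpsm(a, b):
    """Exhaustively find the maximum number of preserved duos.

    Tries every bijection between positions, so the running time
    is factorial in the string length: illustration only.
    """
    best = 0
    for pi in permutations(range(len(b))):
        if all(a[i] == b[pi[i]] for i in range(len(a))):
            best = max(best, preserved_duos(a, b, list(pi)))
    return best


# Example: mapping "abc" onto the trailing "abc" of b and the
# remaining "ab" onto the leading "ab" of b.
example_pi = [2, 3, 4, 0, 1]
example_count = preserved_duos("abcab", "ababc", example_pi)
```

In this example the duos `ab` and `bc` of the first block and the duo `ab` of the second block are preserved, while the duo `ca` straddling the two blocks is not.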

## Subject Classification

##### ACM Subject Classification
• Theory of computation → Approximation algorithms analysis
• Theory of computation → Dynamic programming
• Theory of computation → Pattern matching
• Theory of computation → Complexity classes
##### Keywords
• Maximum-Duo Preservation String Mapping
• Bounded alphabet
• Polynomial Time Approximation Scheme

## References

1. Stefano Beretta, Mauro Castelli, and Riccardo Dondi. Parameterized tractability of the maximum-duo preservation string mapping problem. Theoretical Computer Science, 646:16-25, 2016.
2. Nicolas Boria, Gianpiero Cabodi, Paolo Camurati, Marco Palena, Paolo Pasini, and Stefano Quer. A 7/2-Approximation Algorithm for the Maximum Duo-Preservation String Mapping Problem. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016), pages 11:1-11:8, 2016.
3. Nicolas Boria, Adam Kurpisz, Samuli Leppänen, and Monaldo Mastrolilli. Improved approximation for the maximum duo-preservation string mapping problem. In International Workshop on Algorithms in Bioinformatics (WABI 2014), pages 14-25. Springer, 2014.
4. Brian Brubach. Further improvement in approximating the maximum duo-preservation string mapping problem. In International Workshop on Algorithms in Bioinformatics (WABI 2016), pages 52-64. Springer, 2016.
5. Brian Brubach. Fast matching-based approximations for maximum duo-preservation string mapping and its weighted variant. In Annual Symposium on Combinatorial Pattern Matching (CPM 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018.
6. Laurent Bulteau, Guillaume Fertin, Christian Komusiewicz, and Irena Rusu. A fixed-parameter algorithm for minimum common string partition with few duplications. In Algorithms in Bioinformatics - 13th International Workshop, (WABI 2013), pages 244-258, 2013.
7. Laurent Bulteau and Christian Komusiewicz. Minimum common string partition parameterized by partition size is fixed-parameter tractable. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, (SODA 2014), pages 102-121, 2014.
8. Wenbin Chen, Zhengzhang Chen, Nagiza F. Samatova, Lingxi Peng, Jianxiong Wang, and Maobin Tang. Solving the maximum duo-preservation string mapping problem with linear programming. Theoretical Computer Science, 530:1-11, 2014.
9. Xin Chen, Jie Zheng, Zheng Fu, Peng Nan, Yang Zhong, Stefano Lonardi, and Tao Jiang. Assignment of orthologous genes via genome rearrangement. IEEE ACM Trans. Comput. Biol. Bioinform., 2(4):302-315, 2005.
10. Marek Chrobak, Petr Kolman, and Jiří Sgall. The greedy algorithm for the minimum common string partition problem. In Approximation, Randomization, and Combinatorial Optimization, Algorithms and Techniques, 7th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2004, and 8th International Workshop on Randomization and Computation, RANDOM 2004, Cambridge, MA, USA, August 22-24, 2004, Proceedings, pages 84-95, 2004.
11. Graham Cormode and S. Muthukrishnan. The string edit distance matching problem with moves. ACM Trans. Algorithms, 3(1):2:1-2:19, 2007.
12. Peter Damaschke. Minimum common string partition parameterized. In Algorithms in Bioinformatics, 8th International Workshop, WABI 2008, Karlsruhe, Germany, September 15-19, 2008. Proceedings, pages 87-98, 2008.
13. Bartlomiej Dudek, Pawel Gawrychowski, and Piotr Ostropolski-Nalewaja. A family of approximation algorithms for the maximum duo-preservation string mapping problem. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2017.
14. Michael R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979.
15. Avraham Goldstein, Petr Kolman, and Jie Zheng. Minimum common string partition problem: Hardness and approximations. In Algorithms and Computation, 15th International Symposium, ISAAC 2004, Hong Kong, China, December 20-22, 2004, Proceedings, pages 484-495, 2004.
16. Haitao Jiang, Binhai Zhu, Daming Zhu, and Hong Zhu. Minimum common string partition revisited. J. Comb. Optim., 23(4):519-527, 2012.
17. Bruce Johnson. A bioinformatics-inspired adaptation to Ukkonen’s edit distance calculating algorithm and its applicability towards distributed data mining. Journal of Software Engineering and Applications, 1(1):8-12, 2008.
18. Dan Jurafsky and James H. Martin. Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition, 2nd Edition. Prentice Hall series in artificial intelligence. Prentice Hall, Pearson Education International, 2009.
19. Petr Kolman and Tomasz Walen. Approximating reversal distance for strings with bounded number of duplicates. Discret. Appl. Math., 155(3):327-336, 2007.
20. Petr Kolman and Tomasz Walen. Reversal distance for strings with duplicates: Linear time approximation using hitting set. Electron. J. Comb., 14(1), 2007.
21. Christian Komusiewicz, Mateus de Oliveira Oliveira, and Meirav Zehavi. Revisiting the parameterized complexity of maximum-duo preservation string mapping. Theoretical Computer Science, 847:27-38, 2020.
22. Saeed Mehrabi. Approximating weighted duo-preservation in comparative genomics. In Yixin Cao and Jianer Chen, editors, Computing and Combinatorics, pages 396-406, Cham, 2017. Springer International Publishing.
23. Gonzalo Navarro. A guided tour to approximate string matching. ACM computing surveys (CSUR), 33(1):31-88, 2001.
24. Patsaraporn Somboonsak and Mud-Armeen Munlin. A new edit distance method for finding similarity in DNA sequence. World Academy of Science, Engineering and Technology, International Journal of Biological, Biomolecular, Agricultural, Food and Biotechnological Engineering, 5:622-626, 2011.
25. Torsten Suel and Nasir Memon. Chapter 13 - algorithms for delta compression and remote file synchronization. In Khalid Sayood, editor, Lossless Compression Handbook, Communications, Networking and Multimedia, pages 269-289. Academic Press, San Diego, 2003.
26. Krister M. Swenson and Bernard M. E. Moret. Inversion-based genomic signatures. BMC Bioinform., 10(S-1), 2009.
27. MK Vijaymeena and K Kavitha. A survey on similarity measures in text mining. Machine Learning and Applications: An International Journal (MLAIJ), 3(1), 2016.
28. William E. Winkler. String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage. Proceedings of the Section on Survey Research Methods. American Statistical Association, pages 354-359, 1990.
29. Yao Xu, Yong Chen, Guohui Lin, Tian Liu, Taibo Luo, and Peng Zhang. A (1.4+epsilon)-approximation algorithm for the 2-max-duo problem. In 28th International Symposium on Algorithms and Computation (ISAAC 2017). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2017.