The Maximum Duo-Preservation String Mapping Problem with Bounded Alphabet

Authors Nicolas Boria , Laurent Gourvès, Vangelis Th. Paschos , Jérôme Monnot



PDF
Thumbnail PDF

File

LIPIcs.WABI.2021.5.pdf
  • Filesize: 0.76 MB
  • 12 pages

Document Identifiers

Author Details

Nicolas Boria
  • Université Paris-Dauphine, Université PSL, CNRS, LAMSADE, 75016, Paris, France
Laurent Gourvès
  • Université Paris-Dauphine, Université PSL, CNRS, LAMSADE, 75016, Paris, France
Vangelis Th. Paschos
  • Université Paris-Dauphine, Université PSL, CNRS, LAMSADE, 75016, Paris, France
Jérôme Monnot
  • Université Paris-Dauphine, Université PSL, CNRS, LAMSADE, 75016, Paris, France

Cite AsGet BibTex

Nicolas Boria, Laurent Gourvès, Vangelis Th. Paschos, and Jérôme Monnot. The Maximum Duo-Preservation String Mapping Problem with Bounded Alphabet. In 21st International Workshop on Algorithms in Bioinformatics (WABI 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 201, pp. 5:1-5:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.WABI.2021.5

Abstract

Given two strings A and B such that B is a permutation of A, the max duo-preservation string mapping (MPSM) problem asks to find a mapping π between them so as to preserve a maximum number of duos. A duo is any pair of consecutive characters in a string and it is preserved by π if its two consecutive characters in A are mapped to same two consecutive characters in B. This problem has received a growing attention in recent years, partly as an alternative way to produce approximation algorithms for its minimization counterpart, min common string partition, a widely studied problem due its applications in comparative genomics. Considering this favored field of application with short alphabet, it is surprising that MPSM^𝓁, the variant of MPSM with bounded alphabet, has received so little attention, with a single yet impressive work that provides a 2.67-approximation achieved in O(n) [Brubach, 2018], where n = |A| = |B|. Our work focuses on MPSM^𝓁, and our main contribution is the demonstration that this problem admits a Polynomial Time Approximation Scheme (PTAS) when 𝓁 = O(1). We also provide an alternate, somewhat simpler, proof of NP-hardness for this problem compared with the NP-hardness proof presented in [Haitao Jiang et al., 2012].

Subject Classification

ACM Subject Classification
  • Theory of computation → Approximation algorithms analysis
  • Theory of computation → Dynamic programming
  • Theory of computation → Pattern matching
  • Theory of computation → Complexity classes
Keywords
  • Maximum-Duo Preservation String Mapping
  • Bounded alphabet
  • Polynomial Time Approximation Scheme

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Stefano Beretta, Mauro Castelli, and Riccardo Dondi. Parameterized tractability of the maximum-duo preservation string mapping problem. Theoretical Computer Science, 646:16-25, 2016. Google Scholar
  2. Nicolas Boria, Gianpiero Cabodi, Paolo Camurati, Marco Palena, Paolo Pasini, and Stefano Quer. A 7/2-Approximation Algorithm for the Maximum Duo-Preservation String Mapping Problem. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016), pages 11:1-11:8, 2016. Google Scholar
  3. Nicolas Boria, Adam Kurpisz, Samuli Leppänen, and Monaldo Mastrolilli. Improved approximation for the maximum duo-preservation string mapping problem. In International Workshop on Algorithms in Bioinformatics (WABI 2014), pages 14-25. Springer, 2014. Google Scholar
  4. Brian Brubach. Further improvement in approximating the maximum duo-preservation string mapping problem. In International Workshop on Algorithms in Bioinformatics (WABI 2016), pages 52-64. Springer, 2016. Google Scholar
  5. Brian Brubach. Fast matching-based approximations for maximum duo-preservation string mapping and its weighted variant. In Annual Symposium on Combinatorial Pattern Matching (CPM 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018. Google Scholar
  6. Laurent Bulteau, Guillaume Fertin, Christian Komusiewicz, and Irena Rusu. A fixed-parameter algorithm for minimum common string partition with few duplications. In Algorithms in Bioinformatics - 13th International Workshop, (WABI 2013), pages 244-258, 2013. Google Scholar
  7. Laurent Bulteau and Christian Komusiewicz. Minimum common string partition parameterized by partition size is fixed-parameter tractable. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, (SODA 2014), pages 102-121, 2014. Google Scholar
  8. Wenbin Chen, Zhengzhang Chen, Nagiza F. Samatova, Lingxi Peng, Jianxiong Wang, and Maobin Tang. Solving the maximum duo-preservation string mapping problem with linear programming. Theoretical Computer Science, 530:1-11, 2014. Google Scholar
  9. Xin Chen, Jie Zheng, Zheng Fu, Peng Nan, Yang Zhong, Stefano Lonardi, and Tao Jiang. Assignment of orthologous genes via genome rearrangement. IEEE ACM Trans. Comput. Biol. Bioinform., 2(4):302-315, 2005. Google Scholar
  10. Marek Chrobak, Petr Kolman, and Jirí Sgall. The greedy algorithm for the minimum common string partition problem. In Approximation, Randomization, and Combinatorial Optimization, Algorithms and Techniques, 7th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2004, and 8th International Workshop on Randomization and Computation, RANDOM 2004, Cambridge, MA, USA, August 22-24, 2004, Proceedings, pages 84-95, 2004. Google Scholar
  11. Graham Cormode and S. Muthukrishnan. The string edit distance matching problem with moves. ACM Trans. Algorithms, 3(1):2:1-2:19, 2007. Google Scholar
  12. Peter Damaschke. Minimum common string partition parameterized. In Algorithms in Bioinformatics, 8th International Workshop, WABI 2008, Karlsruhe, Germany, September 15-19, 2008. Proceedings, pages 87-98, 2008. Google Scholar
  13. Bartlomiej Dudek, Pawel Gawrychowski, and Piotr Ostropolski-Nalewaja. A family of approximation algorithms for the maximum duo-preservation string mapping problem. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2017. Google Scholar
  14. Michael R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979. Google Scholar
  15. Avraham Goldstein, Petr Kolman, and Jie Zheng. Minimum common string partition problem: Hardness and approximations. In Algorithms and Computation, 15th International Symposium, ISAAC 2004, Hong Kong, China, December 20-22, 2004, Proceedings, pages 484-495, 2004. Google Scholar
  16. Haitao Jiang, Binhai Zhu, Daming Zhu, and Hong Zhu. Minimum common string partition revisited. J. Comb. Optim., 23(4):519-527, 2012. Google Scholar
  17. Bruce Johnson. A bioinformatics-inspired adaptation to Ukkonen’s edit distance calculating algorithm and its applicability towards distributed data mining. Journal of Software Engineering and Applications, 1(1):8-12, 2008. Google Scholar
  18. Dan Jurafsky and James H. Martin. Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition, 2nd Edition. Prentice Hall series in artificial intelligence. Prentice Hall, Pearson Education International, 2009. Google Scholar
  19. Petr Kolman and Tomasz Walen. Approximating reversal distance for strings with bounded number of duplicates. Discret. Appl. Math., 155(3):327-336, 2007. Google Scholar
  20. Petr Kolman and Tomasz Walen. Reversal distance for strings with duplicates: Linear time approximation using hitting set. Electron. J. Comb., 14(1), 2007. Google Scholar
  21. Christian Komusiewicz, Mateus de Oliveira Oliveira, and Meirav Zehavi. Revisiting the parameterized complexity of maximum-duo preservation string mapping. Theoretical Computer Science, 847:27-38, 2020. Google Scholar
  22. Saeed Mehrabi. Approximating weighted duo-preservation in comparative genomics. In Yixin Cao and Jianer Chen, editors, Computing and Combinatorics, pages 396-406, Cham, 2017. Springer International Publishing. Google Scholar
  23. Gonzalo Navarro. A guided tour to approximate string matching. ACM computing surveys (CSUR), 33(1):31-88, 2001. Google Scholar
  24. Patsaraporn Somboonsak and Mud-Armeen Munlin. A new edit distance method for finding similarity in Dna sequence. World Academy of Science, Engineering and Technology, International Journal of Biological, Biomolecular, Agricultural, Food and Biotechnological Engineering, 5:622-626, 2011. Google Scholar
  25. Torsten Suel and Nasir Memon. Chapter 13 - algorithms for delta compression and remote file synchronization. In Khalid Sayood, editor, Lossless Compression Handbook, Communications, Networking and Multimedia, pages 269-289. Academic Press, San Diego, 2003. Google Scholar
  26. Krister M. Swenson and Bernard M. E. Moret. Inversion-based genomic signatures. BMC Bioinform., 10(S-1), 2009. Google Scholar
  27. MK Vijaymeena and K Kavitha. A survey on similarity measures in text mining. Machine Learning and Applications: An International Journal (MLAIJ), 3(1), 2016. Google Scholar
  28. William E Winkler. String comparator metrics and enhanced decision rules in the fellegi-sunter model of record linkage. Proceedings of the Section on Survey Research Methods. American Statistical Association, pages 354-359, 1990. Google Scholar
  29. Yao Xu, Yong Chen, Guohui Lin, Tian Liu, Taibo Luo, and Peng Zhang. A (1.4+epsilon)-approximation algorithm for the 2-max-duo problem. In 28th International Symposium on Algorithms and Computation (ISAAC 2017). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2017. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail