eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2020-06-09
25:1
25:12
10.4230/LIPIcs.CPM.2020.25
article
Chaining with Overlaps Revisited
Mäkinen, Veli
1
https://orcid.org/0000-0003-4454-1493
Sahlin, Kristoffer
2
https://orcid.org/0000-0001-7378-2320
Department of Computer Science, University of Helsinki, Finland
Department of Mathematics, Science for Life Laboratory, Stockholm University, Sweden
Chaining algorithms aim to form a semi-global alignment of two sequences based on a set of anchoring local alignments as input. Depending on the optimization criteria and the exact definition of a chain, there are several O(n log n) time algorithms to solve this problem optimally, where n is the number of input anchors.
In this paper, we focus on a formulation allowing the anchors to overlap in a chain. This formulation was studied by Shibuya and Kurochkin (WABI 2003), but their algorithm comes with no proof of correctness. We revisit and modify their algorithm to consider a strict definition of the precedence relation on anchors, and add the derivation needed to establish the correctness of the resulting algorithm, which runs in O(n log² n) time on anchors formed by exact matches. With the more relaxed definition of the precedence relation considered by Shibuya and Kurochkin, or when anchors are non-nested, such as matches of uniform length (k-mers), the algorithm takes O(n log n) time.
We also establish a connection between chaining with overlaps and the widely studied longest common subsequence problem.
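To make the problem setting concrete, the following is a minimal quadratic-time sketch of chaining with overlaps. It is not the O(n log n) or O(n log² n) algorithm of the paper. Everything in it is an assumption made for illustration: the anchor representation (x, y, length), the strict precedence test (starts and ends both strictly increasing in both sequences), the scoring rule (each anchor contributes its length minus the larger of its two overlaps with its predecessor), and the names precedes, overlap and chain_score.

```python
from typing import List, Tuple

# Hypothetical anchor format: (x, y, length) means A[x:x+length] == B[y:y+length].
Anchor = Tuple[int, int, int]


def precedes(a: Anchor, b: Anchor) -> bool:
    """Assumed strict precedence: starts and ends increase strictly in both sequences."""
    ax, ay, al = a
    bx, by, bl = b
    return ax < bx and ay < by and ax + al < bx + bl and ay + al < by + bl


def overlap(a: Anchor, b: Anchor) -> int:
    """Larger of the overlaps of a and b in the two sequences (0 if disjoint)."""
    ax, ay, al = a
    bx, by, _ = b
    return max(0, ax + al - bx, ay + al - by)


def chain_score(anchors: List[Anchor]) -> int:
    """Best chain score by an O(n^2) dynamic program over anchors sorted by start."""
    order = sorted(anchors)  # a predecessor always has a smaller x, so it comes first
    best: List[int] = []     # best[j] = best score of a chain ending at order[j]
    for j, b in enumerate(order):
        score = b[2]  # chain consisting of b alone
        for i in range(j):
            a = order[i]
            if precedes(a, b):
                # Extend the best chain ending at a, counting only the part of b
                # that lies beyond a in both sequences.
                score = max(score, best[i] + b[2] - overlap(a, b))
        best.append(score)
    return max(best, default=0)
```

Under these assumptions, two anchors (0, 0, 4) and (3, 2, 4) overlap by one position in the first sequence and two in the second, so the chain scores 4 + 4 - 2 = 6. When every anchor is a single-character match, overlaps vanish and the score equals the length of a longest common subsequence. This is one simple way to see a link between the two problems, and it is not necessarily the connection established in the paper.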
https://drops.dagstuhl.de/storage/00lipics/lipics-vol161-cpm2020/LIPIcs.CPM.2020.25/LIPIcs.CPM.2020.25.pdf
Sparse Dynamic Programming
Chaining
Maximal Exact Matches
Longest Common Subsequence