
**Published in:** LIPIcs, Volume 296, 35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024)

In this paper, we study for the first time the Diverse Longest Common Subsequences (LCSs) problem under Hamming distance. Given a constant number of input strings, the problem asks whether there exists a set X of K longest common subsequences whose diversity is at least a specified threshold Δ. We consider two diversity measures for a set X of strings of equal length: the Sum diversity and the Min diversity, defined as the sum and the minimum, respectively, of the pairwise Hamming distances between strings in X. We analyze the computational complexity of the corresponding problems, called Max-Sum Diverse LCSs and Max-Min Diverse LCSs, considering both approximation algorithms and parameterized complexity. Our results are summarized as follows. When K is bounded, both problems are polynomial-time solvable. In contrast, when K is unbounded, both problems become NP-hard, while the Max-Sum Diverse LCSs problem admits a PTAS. Furthermore, we analyze the parameterized complexity of both problems with combinations of parameters K and r, where r is the length of the candidate strings to be selected. Importantly, all positive results above are proven in a more general setting, where the input is an edge-labeled directed acyclic graph (DAG) that succinctly represents a set of strings of the same length. Negative results are proven in the setting where the input is given explicitly as a set of strings. The latter results are equipped with an encoding of such a set as the longest common subsequences of a specific input string set.
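The two diversity measures defined in the abstract can be illustrated with a minimal Python sketch. The function names `hamming`, `sum_diversity`, and `min_diversity` are hypothetical and chosen here for illustration; this only evaluates the measures on a given set of equal-length strings and is not one of the paper's algorithms.

```python
from itertools import combinations


def hamming(x: str, y: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(x, y))


def sum_diversity(strings) -> int:
    """Sum of the pairwise Hamming distances over all unordered pairs."""
    return sum(hamming(x, y) for x, y in combinations(strings, 2))


def min_diversity(strings) -> int:
    """Minimum pairwise Hamming distance (needs at least two strings)."""
    return min(hamming(x, y) for x, y in combinations(strings, 2))


# Illustrative set X of K = 3 strings of length r = 4.
X = ["abab", "abba", "baba"]
print(sum_diversity(X), min_diversity(X))  # prints: 8 2
```

In the decision problems above, a set X of K strings is a yes-certificate when the chosen measure is at least the threshold Δ.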

Yuto Shida, Giulia Punzi, Yasuaki Kobayashi, Takeaki Uno, and Hiroki Arimura. Finding Diverse Strings and Longest Common Subsequences in a Graph. In 35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 296, pp. 27:1-27:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


@InProceedings{shida_et_al:LIPIcs.CPM.2024.27,
  author =	{Shida, Yuto and Punzi, Giulia and Kobayashi, Yasuaki and Uno, Takeaki and Arimura, Hiroki},
  title =	{{Finding Diverse Strings and Longest Common Subsequences in a Graph}},
  booktitle =	{35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024)},
  pages =	{27:1--27:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-326-3},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{296},
  editor =	{Inenaga, Shunsuke and Puglisi, Simon J.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2024.27},
  URN =		{urn:nbn:de:0030-drops-201370},
  doi =		{10.4230/LIPIcs.CPM.2024.27},
  annote =	{Keywords: Sequence analysis, longest common subsequence, Hamming distance, dispersion, approximation algorithms, parameterized complexity}
}


**Published in:** LIPIcs, Volume 223, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)

Park et al. [TCS 2020] observed that the similarity between two (numerical) strings can be captured by Cartesian trees: the Cartesian tree of a string is a binary tree recursively constructed by picking the smallest value of the string as the root of the tree. Two strings of equal length are said to Cartesian-tree match if their Cartesian trees are isomorphic. Park et al. [TCS 2020] introduced the following Cartesian tree substring matching (CTMStr) problem: given a text string T of length n and a pattern string P of length m, find every consecutive substring S = T[i..j] of T such that S and P Cartesian-tree match. They showed how to solve this problem in Õ(n+m) time. In this paper, we introduce the Cartesian tree subsequence matching (CTMSeq) problem, which asks to find every minimal substring S = T[i..j] of T such that S contains a subsequence S' that Cartesian-tree matches P. We prove that the CTMSeq problem can be solved efficiently, in O(m n p(n)) time, where p(n) denotes the update/query time for dynamic predecessor queries. By using a suitable dynamic predecessor data structure, we obtain an O(mn log log n)-time and O(n log m)-space solution for CTMSeq. This contrasts CTMSeq with the closely related order-preserving subsequence matching (OPMSeq) problem, which was shown to be NP-hard by Bose et al. [IPL 1998].
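The notion of Cartesian-tree matching can be illustrated with a small Python sketch. It builds the tree with the standard stack-based construction and compares parent arrays; it assumes ties are broken toward the leftmost minimum, which the abstract does not specify. The names `cartesian_tree_parents`, `ct_match`, and `naive_ctm_substrings` are hypothetical, and the last function is a naive O(nm) check for the substring problem (CTMStr), not the Õ(n+m) algorithm of Park et al. nor the CTMSeq algorithm of this paper.

```python
def cartesian_tree_parents(s):
    """Parent index of every position in the Cartesian tree of s (-1 for the root).

    Standard stack-based construction in linear time. Ties keep the
    leftmost minimum as the ancestor (an assumption of this sketch).
    """
    parent = [-1] * len(s)
    stack = []  # indices on the rightmost path, values non-decreasing
    for i, v in enumerate(s):
        last = -1
        while stack and s[stack[-1]] > v:
            last = stack.pop()
        if last != -1:
            parent[last] = i          # popped subtree becomes the left child of i
        if stack:
            parent[i] = stack[-1]     # i becomes the right child of the stack top
        stack.append(i)
    return parent


def ct_match(a, b):
    """True if a and b have equal length and identical Cartesian tree shapes."""
    return len(a) == len(b) and cartesian_tree_parents(a) == cartesian_tree_parents(b)


def naive_ctm_substrings(text, pattern):
    """Start positions i such that text[i:i+m] Cartesian-tree matches pattern."""
    m = len(pattern)
    target = cartesian_tree_parents(pattern)
    return [i for i in range(len(text) - m + 1)
            if cartesian_tree_parents(text[i:i + m]) == target]


print(ct_match([3, 1, 4, 2], [30, 10, 40, 20]))                 # True
print(naive_ctm_substrings([5, 3, 1, 4, 2, 6], [3, 1, 4, 2]))   # [1]
```

Comparing parent arrays suffices here because the nodes of a Cartesian tree are identified by their positions in the string, so two trees have the same shape exactly when every position has the same parent.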

Tsubasa Oizumi, Takeshi Kai, Takuya Mieno, Shunsuke Inenaga, and Hiroki Arimura. Cartesian Tree Subsequence Matching. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 14:1-14:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


@InProceedings{oizumi_et_al:LIPIcs.CPM.2022.14,
  author =	{Oizumi, Tsubasa and Kai, Takeshi and Mieno, Takuya and Inenaga, Shunsuke and Arimura, Hiroki},
  title =	{{Cartesian Tree Subsequence Matching}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{14:1--14:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.14},
  URN =		{urn:nbn:de:0030-drops-161414},
  doi =		{10.4230/LIPIcs.CPM.2022.14},
  annote =	{Keywords: string algorithms, pattern matching, Cartesian tree subsequence matching, order preserving matching, episode matching}
}


**Published in:** LIPIcs, Volume 123, 29th International Symposium on Algorithms and Computation (ISAAC 2018)

A dominating set D of a graph G is a set of vertices such that every vertex of G is in D or has a neighbor in D. Enumeration of minimal dominating sets in a graph is one of the central problems in the study of enumeration, since it corresponds to enumeration of minimal hypergraph transversals. However, enumeration of dominating sets including non-minimal ones has not received much attention. In this paper, we address enumeration problems for dominating sets in sparse graphs, namely degenerate graphs and graphs with large girth, and we propose two algorithms for solving the problems. The first algorithm enumerates all the dominating sets of a k-degenerate graph in O(k) time per solution using O(n + m) space, where n and m are the numbers of vertices and edges of the input graph, respectively. That is, the algorithm is optimal for graphs with constant degeneracy, such as trees, planar graphs, and H-minor-free graphs for some fixed H. The second algorithm enumerates all the dominating sets in constant time per solution for input graphs with girth at least nine.
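The definition of a dominating set can be made concrete with a short Python sketch. The names `is_dominating` and `enumerate_dominating_sets` are hypothetical, and the graph is assumed to be given as an adjacency dictionary. The enumerator is a brute-force baseline that tests every vertex subset in exponential time; it is not the O(k)-time-per-solution or constant-time-per-solution algorithm of the paper.

```python
from itertools import combinations


def is_dominating(adj, d):
    """True if every vertex is in d or has at least one neighbor in d."""
    d = set(d)
    return all(v in d or any(u in d for u in adj[v]) for v in adj)


def enumerate_dominating_sets(adj):
    """Yield every dominating set (minimal or not) by testing all subsets."""
    vertices = sorted(adj)
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_dominating(adj, subset):
                yield set(subset)


# Path graph 1 - 2 - 3 as an adjacency dictionary.
path = {1: [2], 2: [1, 3], 3: [2]}
for d in enumerate_dominating_sets(path):
    print(d)
```

On this path, the sketch reports five dominating sets: {2}, {1, 2}, {1, 3}, {2, 3}, and {1, 2, 3}.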

Kazuhiro Kurita, Kunihiro Wasa, Hiroki Arimura, and Takeaki Uno. Efficient Enumeration of Dominating Sets for Sparse Graphs. In 29th International Symposium on Algorithms and Computation (ISAAC 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 123, pp. 8:1-8:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


@InProceedings{kurita_et_al:LIPIcs.ISAAC.2018.8,
  author =	{Kurita, Kazuhiro and Wasa, Kunihiro and Arimura, Hiroki and Uno, Takeaki},
  title =	{{Efficient Enumeration of Dominating Sets for Sparse Graphs}},
  booktitle =	{29th International Symposium on Algorithms and Computation (ISAAC 2018)},
  pages =	{8:1--8:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-094-1},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{123},
  editor =	{Hsu, Wen-Lian and Lee, Der-Tsai and Liao, Chung-Shou},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2018.8},
  URN =		{urn:nbn:de:0030-drops-99560},
  doi =		{10.4230/LIPIcs.ISAAC.2018.8},
  annote =	{Keywords: Enumeration algorithm, polynomial amortized time, dominating set, girth, degeneracy}
}


**Published in:** LIPIcs, Volume 54, 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)

We consider fully-online construction of indexing data structures for multiple texts. Let T = {T_1, ..., T_K} be a collection of texts. By fully-online, we mean that a new character can be appended to any text in T at any time. This is a natural generalization of semi-online construction of indexing data structures for multiple texts, in which, once a new character is appended to the k-th text T_k, the previous texts T_1, ..., T_{k-1} remain static. Our fully-online scenario arises when we maintain dynamic indexes for multi-sensor data. Let N and sigma denote the total length of the texts in T and the alphabet size, respectively. We first show that the algorithm by Blumer et al. [Theoretical Computer Science, 40:31-55, 1985] to construct the directed acyclic word graph (DAWG) for T can readily be extended to our fully-online setting, retaining O(N log sigma)-time and O(N)-space complexities. Then, we give a sophisticated fully-online algorithm which constructs the suffix tree for T in O(N log sigma) time and O(N) space. A key idea of this algorithm is synchronized maintenance of the DAWG and the suffix tree.

Takuya Takagi, Shunsuke Inenaga, and Hiroki Arimura. Fully-online Construction of Suffix Trees for Multiple Texts. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, pp. 22:1-22:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


@InProceedings{takagi_et_al:LIPIcs.CPM.2016.22,
  author =	{Takagi, Takuya and Inenaga, Shunsuke and Arimura, Hiroki},
  title =	{{Fully-online Construction of Suffix Trees for Multiple Texts}},
  booktitle =	{27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016)},
  pages =	{22:1--22:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-012-5},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{54},
  editor =	{Grossi, Roberto and Lewenstein, Moshe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2016.22},
  URN =		{urn:nbn:de:0030-drops-60719},
  doi =		{10.4230/LIPIcs.CPM.2016.22},
  annote =	{Keywords: suffix trees, DAWGs, multiple texts, online algorithms}
}