Search Results

Documents authored by Shinohara, Ayumi


Document
Parallel Algorithm for Pattern Matching Problems Under Substring Consistent Equivalence Relations

Authors: Davaajav Jargalsaikhan, Diptarama Hendrian, Ryo Yoshinaka, and Ayumi Shinohara

Published in: LIPIcs, Volume 223, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)


Abstract
Given a text and a pattern over an alphabet, the pattern matching problem searches for all occurrences of the pattern in the text. An equivalence relation ≈ is a substring consistent equivalence relation (SCER), if for two strings X and Y, X ≈ Y implies |X| = |Y| and X[i:j] ≈ Y[i:j] for all 1 ≤ i ≤ j ≤ |X|. In this paper, we propose an efficient parallel algorithm for pattern matching under any SCER using the "duel-and-sweep" paradigm. For a pattern of length m and a text of length n, our algorithm runs in O(ξ_m^t log³ m) time and O(ξ_m^w ⋅ n log² m) work, with O(τ_n^t + ξ_m^t log² m) time and O(τ_n^w + ξ_m^w ⋅ m log² m) work preprocessing on the Priority Concurrent Read Concurrent Write Parallel Random-Access Machines (P-CRCW PRAM), where τ_n^t, τ_n^w, ξ_m^t, and ξ_m^w are parameters dependent on SCERs, which are often linear in n and m, respectively.

Cite as

Davaajav Jargalsaikhan, Diptarama Hendrian, Ryo Yoshinaka, and Ayumi Shinohara. Parallel Algorithm for Pattern Matching Problems Under Substring Consistent Equivalence Relations. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 28:1-28:21, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{jargalsaikhan_et_al:LIPIcs.CPM.2022.28,
  author =	{Jargalsaikhan, Davaajav and Hendrian, Diptarama and Yoshinaka, Ryo and Shinohara, Ayumi},
  title =	{{Parallel Algorithm for Pattern Matching Problems Under Substring Consistent Equivalence Relations}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{28:1--28:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.28},
  URN =		{urn:nbn:de:0030-drops-161552},
  doi =		{10.4230/LIPIcs.CPM.2022.28},
  annote =	{Keywords: parallel algorithm, substring consistent equivalence relation, pattern matching}
}
Document
Fast and Linear-Time String Matching Algorithms Based on the Distances of q-Gram Occurrences

Authors: Satoshi Kobayashi, Diptarama Hendrian, Ryo Yoshinaka, and Ayumi Shinohara

Published in: LIPIcs, Volume 160, 18th International Symposium on Experimental Algorithms (SEA 2020)


Abstract
Given a text T of length n and a pattern P of length m, the string matching problem is a task to find all occurrences of P in T. In this study, we propose an algorithm that solves this problem in O((n + m)q) time considering the distance between two adjacent occurrences of the same q-gram contained in P. We also propose a theoretical improvement of it which runs in O(n + m) time, though it is not necessarily faster in practice. We compare the execution times of our and existing algorithms on various kinds of real and artificial datasets such as an English text, a genome sequence and a Fibonacci string. The experimental results show that our algorithm is as fast as the state-of-the-art algorithms in many cases, particularly when a pattern frequently appears in a text.

Cite as

Satoshi Kobayashi, Diptarama Hendrian, Ryo Yoshinaka, and Ayumi Shinohara. Fast and Linear-Time String Matching Algorithms Based on the Distances of q-Gram Occurrences. In 18th International Symposium on Experimental Algorithms (SEA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 160, pp. 13:1-13:13, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{kobayashi_et_al:LIPIcs.SEA.2020.13,
  author =	{Kobayashi, Satoshi and Hendrian, Diptarama and Yoshinaka, Ryo and Shinohara, Ayumi},
  title =	{{Fast and Linear-Time String Matching Algorithms Based on the Distances of q-Gram Occurrences}},
  booktitle =	{18th International Symposium on Experimental Algorithms (SEA 2020)},
  pages =	{13:1--13:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-148-1},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{160},
  editor =	{Faro, Simone and Cantone, Domenico},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2020.13},
  URN =		{urn:nbn:de:0030-drops-120878},
  doi =		{10.4230/LIPIcs.SEA.2020.13},
  annote =	{Keywords: String matching algorithm, text processing}
}
Document
Detecting k-(Sub-)Cadences and Equidistant Subsequence Occurrences

Authors: Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, and Ayumi Shinohara

Published in: LIPIcs, Volume 161, 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)


Abstract
The equidistant subsequence pattern matching problem is considered. Given a pattern string P and a text string T, we say that P is an equidistant subsequence of T if P is a subsequence of the text such that consecutive symbols of P in the occurrence are equally spaced. We can consider the problem of equidistant subsequences as generalizations of (sub-)cadences. We give bit-parallel algorithms that yield o(n²) time algorithms for finding k-(sub-)cadences and equidistant subsequences. Furthermore, O(nlog² n) and O(nlog n) time algorithms, respectively for equidistant and Abelian equidistant matching for the case |P| = 3, are shown. The algorithms make use of a technique that was recently introduced which can efficiently compute convolutions with linear constraints.

Cite as

Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, and Ayumi Shinohara. Detecting k-(Sub-)Cadences and Equidistant Subsequence Occurrences. In 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 161, pp. 12:1-12:11, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{funakoshi_et_al:LIPIcs.CPM.2020.12,
  author =	{Funakoshi, Mitsuru and Nakashima, Yuto and Inenaga, Shunsuke and Bannai, Hideo and Takeda, Masayuki and Shinohara, Ayumi},
  title =	{{Detecting k-(Sub-)Cadences and Equidistant Subsequence Occurrences}},
  booktitle =	{31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
  pages =	{12:1--12:11},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-149-8},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{161},
  editor =	{G{\o}rtz, Inge Li and Weimann, Oren},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2020.12},
  URN =		{urn:nbn:de:0030-drops-121375},
  doi =		{10.4230/LIPIcs.CPM.2020.12},
  annote =	{Keywords: string algorithms, pattern matching, bit parallelism, subsequences, cadences}
}
Document
In-Place Bijective Burrows-Wheeler Transforms

Authors: Dominik Köppl, Daiki Hashimoto, Diptarama Hendrian, and Ayumi Shinohara

Published in: LIPIcs, Volume 161, 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)


Abstract
One of the most well-known variants of the Burrows-Wheeler transform (BWT) [Burrows and Wheeler, 1994] is the bijective BWT (BBWT) [Gil and Scott, arXiv 2012], which applies the extended BWT (EBWT) [Mantaci et al., TCS 2007] to the multiset of Lyndon factors of a given text. Since the EBWT is invertible, the BBWT is a bijective transform in the sense that the inverse image of the EBWT restores this multiset of Lyndon factors such that the original text can be obtained by sorting these factors in non-increasing order. In this paper, we present algorithms constructing or inverting the BBWT in-place using quadratic time. We also present conversions from the BBWT to the BWT, or vice versa, either (a) in-place using quadratic time, or (b) in the run-length compressed setting using 𝒪(n lg r / lg lg r) time with 𝒪(r lg n) bits of words, where r is the sum of character runs in the BWT and the BBWT.

Cite as

Dominik Köppl, Daiki Hashimoto, Diptarama Hendrian, and Ayumi Shinohara. In-Place Bijective Burrows-Wheeler Transforms. In 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 161, pp. 21:1-21:15, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{koppl_et_al:LIPIcs.CPM.2020.21,
  author =	{K\"{o}ppl, Dominik and Hashimoto, Daiki and Hendrian, Diptarama and Shinohara, Ayumi},
  title =	{{In-Place Bijective Burrows-Wheeler Transforms}},
  booktitle =	{31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
  pages =	{21:1--21:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-149-8},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{161},
  editor =	{G{\o}rtz, Inge Li and Weimann, Oren},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2020.21},
  URN =		{urn:nbn:de:0030-drops-121463},
  doi =		{10.4230/LIPIcs.CPM.2020.21},
  annote =	{Keywords: In-Place Algorithms, Burrows-Wheeler transform, Lyndon words}
}
Document
DAWGs for Parameterized Matching: Online Construction and Related Indexing Structures

Authors: Katsuhito Nakashima, Noriki Fujisato, Diptarama Hendrian, Yuto Nakashima, Ryo Yoshinaka, Shunsuke Inenaga, Hideo Bannai, Ayumi Shinohara, and Masayuki Takeda

Published in: LIPIcs, Volume 161, 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)


Abstract
Two strings x and y over Σ ∪ Π of equal length are said to parameterized match (p-match) if there is a renaming bijection f:Σ ∪ Π → Σ ∪ Π that is identity on Σ and transforms x to y (or vice versa). The p-matching problem is to look for substrings in a text that p-match a given pattern. In this paper, we propose parameterized suffix automata (p-suffix automata) and parameterized directed acyclic word graphs (PDAWGs) which are the p-matching versions of suffix automata and DAWGs. While suffix automata and DAWGs are equivalent for standard strings, we show that p-suffix automata can have Θ(n²) nodes and edges but PDAWGs have only O(n) nodes and edges, where n is the length of an input string. We also give O(n |Π| log (|Π| + |Σ|))-time O(n)-space algorithm that builds the PDAWG in a left-to-right online manner. As a byproduct, it is shown that the parameterized suffix tree for the reversed string can also be built in the same time and space, in a right-to-left online manner.

Cite as

Katsuhito Nakashima, Noriki Fujisato, Diptarama Hendrian, Yuto Nakashima, Ryo Yoshinaka, Shunsuke Inenaga, Hideo Bannai, Ayumi Shinohara, and Masayuki Takeda. DAWGs for Parameterized Matching: Online Construction and Related Indexing Structures. In 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 161, pp. 26:1-26:14, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{nakashima_et_al:LIPIcs.CPM.2020.26,
  author =	{Nakashima, Katsuhito and Fujisato, Noriki and Hendrian, Diptarama and Nakashima, Yuto and Yoshinaka, Ryo and Inenaga, Shunsuke and Bannai, Hideo and Shinohara, Ayumi and Takeda, Masayuki},
  title =	{{DAWGs for Parameterized Matching: Online Construction and Related Indexing Structures}},
  booktitle =	{31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
  pages =	{26:1--26:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-149-8},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{161},
  editor =	{G{\o}rtz, Inge Li and Weimann, Oren},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2020.26},
  URN =		{urn:nbn:de:0030-drops-121512},
  doi =		{10.4230/LIPIcs.CPM.2020.26},
  annote =	{Keywords: parameterized matching, suffix trees, DAWGs, suffix automata}
}
Document
Position Heaps for Parameterized Strings

Authors: Diptarama Diptarama, Takashi Katsura, Yuhei Otomo, Kazuyuki Narisawa, and Ayumi Shinohara

Published in: LIPIcs, Volume 78, 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)


Abstract
We propose a new indexing structure for parameterized strings, called parameterized position heap. Parameterized position heap is applicable for parameterized pattern matching problem, where the pattern matches a substring of the text if there exists a bijective mapping from the symbols of the pattern to the symbols of the substring. We propose an online construction algorithm of parameterized position heap of a text and show that our algorithm runs in linear time with respect to the text size. We also show that by using parameterized position heap, we can find all occurrences of a pattern in the text in linear time with respect to the product of the pattern size and the alphabet size.

Cite as

Diptarama Diptarama, Takashi Katsura, Yuhei Otomo, Kazuyuki Narisawa, and Ayumi Shinohara. Position Heaps for Parameterized Strings. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 78, pp. 8:1-8:13, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{diptarama_et_al:LIPIcs.CPM.2017.8,
  author =	{Diptarama, Diptarama and Katsura, Takashi and Otomo, Yuhei and Narisawa, Kazuyuki and Shinohara, Ayumi},
  title =	{{Position Heaps for Parameterized Strings}},
  booktitle =	{28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)},
  pages =	{8:1--8:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-039-2},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{78},
  editor =	{K\"{a}rkk\"{a}inen, Juha and Radoszewski, Jakub and Rytter, Wojciech},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2017.8},
  URN =		{urn:nbn:de:0030-drops-73396},
  doi =		{10.4230/LIPIcs.CPM.2017.8},
  annote =	{Keywords: string matching, indexing structure, parameterized pattern matching, position heap}
}
Document
An Efficient Algorithm to Test Square-Freeness of Strings Compressed by Balanced Straight Line Program

Authors: Wataru Matsubara, Shunsuke Inenaga, and Ayumi Shinohara

Published in: Dagstuhl Seminar Proceedings, Volume 8261, Structure-Based Compression of Complex Massive Data (2008)


Abstract
In this paper we study the problem of deciding whether a given compressed string contains a square. A string x is called a square if x = zz and z = u^k implies k = 1 and u = z. A string w is said to be square-free if no substrings of w are squares. Many efficient algorithms to test if a given string is square-free, have been developed so far. However, very little is known for testing square-freeness of a given compressed string. In this paper, we give an O(max(n^2; n log^2 N))-time O(n^2)-space solution to test square-freeness of a given compressed string, where n and N are the size of a given compressed string and the corresponding decompressed string, respectively. Our input strings are compressed by balanced straight line program (BSLP). We remark that BSLP has exponential compression, that is, N = O(2^n). Hence no decompress-then-test approaches can be better than our method in the worst case.

Cite as

Wataru Matsubara, Shunsuke Inenaga, and Ayumi Shinohara. An Efficient Algorithm to Test Square-Freeness of Strings Compressed by Balanced Straight Line Program. In Structure-Based Compression of Complex Massive Data. Dagstuhl Seminar Proceedings, Volume 8261, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2008)


Copy BibTex To Clipboard

@InProceedings{matsubara_et_al:DagSemProc.08261.5,
  author =	{Matsubara, Wataru and Inenaga, Shunsuke and Shinohara, Ayumi},
  title =	{{An Efficient Algorithm to Test Square-Freeness of Strings Compressed by Balanced Straight Line Program}},
  booktitle =	{Structure-Based Compression of Complex Massive Data},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{8261},
  editor =	{Stefan B\"{o}ttcher and Markus Lohrey and Sebastian Maneth and Wojcieh Rytter},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08261.5},
  URN =		{urn:nbn:de:0030-drops-16804},
  doi =		{10.4230/DagSemProc.08261.5},
  annote =	{Keywords: Square Freeness, Straight Line Program}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail