An Improved Data Structure for Left-Right Maximal Generic Words Problem

Fujishige, Yuta; Nakashima, Yuto; Inenaga, Shunsuke; Bannai, Hideo; Takeda, Masayuki

doi:10.4230/LIPIcs.ISAAC.2019.40

Abstract

For a set D of documents and a positive integer d, a string w is said to be d-left-right maximal if (1) w occurs in at least d documents in D, and (2) any proper superstring of w occurs in fewer than d documents. The left-right-maximal generic words problem is, given a set D of documents, to preprocess D so that for any string p and any positive integer d, all superstrings of p that are d-left-right maximal can be reported quickly. In this paper, we present an O(n log m)-space data structure (in words) which answers queries in O(|p| + o log log m) time, where n is the total length of the documents in D, m is the number of documents in D, and o is the number of outputs. Our solution improves on the previous one by Nishimoto et al. (PSC 2015), which uses an O(n log n)-space data structure answering queries in O(|p| + r log n + o log^2 n) time, where r is the number of right-extensions q of p occurring in at least d documents such that any proper right-extension of q occurs in fewer than d documents.
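
To make the definition concrete, the following is a minimal brute-force sketch in Python of what a query (p, d) is expected to return. It is not the data structure of the paper: it enumerates every substring and makes no attempt to meet the stated space or time bounds. The function names `document_frequencies` and `left_right_maximal` are hypothetical, chosen for illustration only. The sketch relies on the observation that document frequency cannot increase when a string is extended, so checking the one-character left and right extensions of w is enough to decide condition (2).

```python
from collections import defaultdict


def document_frequencies(docs):
    """Map every non-empty substring to the number of documents containing it."""
    df = defaultdict(int)
    for doc in docs:
        # Collect the distinct substrings of this document so each counts once.
        seen = {doc[i:j] for i in range(len(doc)) for j in range(i + 1, len(doc) + 1)}
        for s in seen:
            df[s] += 1
    return df


def left_right_maximal(docs, p, d):
    """Return all d-left-right maximal superstrings of p, sorted (brute force)."""
    df = document_frequencies(docs)
    alphabet = set("".join(docs))
    result = []
    for w, count in df.items():
        # Condition (1): w occurs in at least d documents; w must also contain p.
        if count < d or p not in w:
            continue
        # Condition (2): no one-character extension occurs in d or more documents.
        if any(df.get(c + w, 0) >= d or df.get(w + c, 0) >= d for c in alphabet):
            continue
        result.append(w)
    return sorted(result)


docs = ["abcab", "bcabd", "cabe"]
print(left_right_maximal(docs, "a", 2))  # ['bcab']
print(left_right_maximal(docs, "a", 3))  # ['cab']
```

In this toy input, "bcab" occurs in two documents and every extension of it occurs in only one, so it is the single answer for d = 2; for d = 3 the answer shrinks to "cab", the longest string containing "a" that is shared by all three documents.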

Cite As

Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. An Improved Data Structure for Left-Right Maximal Generic Words Problem. In 30th International Symposium on Algorithms and Computation (ISAAC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 149, pp. 40:1-40:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019) https://doi.org/10.4230/LIPIcs.ISAAC.2019.40

Author Details

Yuta Fujishige

Department of Informatics, Kyushu University, Japan

Yuto Nakashima

Department of Informatics, Kyushu University, Japan

Shunsuke Inenaga

Department of Informatics, Kyushu University, Japan

Hideo Bannai

Department of Informatics, Kyushu University, Japan

Masayuki Takeda

Department of Informatics, Kyushu University, Japan

Funding

Nakashima, Yuto: Supported by JSPS KAKENHI Grant Number JP18K18002.
Inenaga, Shunsuke: Supported by JSPS KAKENHI Grant Number JP17H01697.
Bannai, Hideo: Supported by JSPS KAKENHI Grant Number JP16H02783.
Takeda, Masayuki: Supported by JSPS KAKENHI Grant Number JP18H04098.

References

Omer Berkman and Uzi Vishkin. Finding Level-Ancestors in Trees. J. Comput. Syst. Sci., 48(2):214-230, 1994.
Sudip Biswas, Manish Patil, Rahul Shah, and Sharma V. Thankachan. Succinct Indexes for Reporting Discriminating and Generic Words. In Edleno Silva de Moura and Maxime Crochemore, editors, String Processing and Information Retrieval - 21st International Symposium, SPIRE 2014, Ouro Preto, Brazil, October 20-22, 2014. Proceedings, volume 8799 of Lecture Notes in Computer Science, pages 89-100. Springer, 2014. URL: https://doi.org/10.1007/978-3-319-11918-2.
Pawel Gawrychowski, Gregory Kucherov, Yakov Nekrich, and Tatiana A. Starikovskaya. Minimal Discriminating Words Problem Revisited. In Oren Kurland, Moshe Lewenstein, and Ely Porat, editors, String Processing and Information Retrieval - 20th International Symposium, SPIRE 2013, Jerusalem, Israel, October 7-9, 2013, Proceedings, volume 8214 of Lecture Notes in Computer Science, pages 129-140. Springer, 2013. URL: https://doi.org/10.1007/978-3-319-02432-5.
Alexander Golynski, J. Ian Munro, and S. Srinivasa Rao. Rank/select operations on large alphabets: a tool for text indexing. In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2006, Miami, Florida, USA, January 22-26, 2006, pages 368-373. ACM Press, 2006. URL: http://dl.acm.org/citation.cfm?id=1109557.1109599.
Gregory Kucherov, Yakov Nekrich, and Tatiana A. Starikovskaya. Computing Discriminating and Generic Words. In Liliana Calderón-Benavides, Cristina N. González-Caro, Edgar Chávez, and Nivio Ziviani, editors, String Processing and Information Retrieval - 19th International Symposium, SPIRE 2012, Cartagena de Indias, Colombia, October 21-25, 2012. Proceedings, volume 7608 of Lecture Notes in Computer Science, pages 307-317. Springer, 2012. URL: https://doi.org/10.1007/978-3-642-34109-0.
S. Muthukrishnan. Efficient algorithms for document retrieval problems. In David Eppstein, editor, Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, January 6-8, 2002, San Francisco, CA, USA, pages 657-666. ACM/SIAM, 2002. URL: http://dl.acm.org/citation.cfm?id=545381.545469.
Takaaki Nishimoto, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing Left-Right Maximal Generic Words. In Jan Holub and Jan Zdárek, editors, Proceedings of the Prague Stringology Conference 2015, Prague, Czech Republic, August 24-26, 2015, pages 5-16. Department of Theoretical Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2015. URL: http://www.stringology.org/event/2015/p02.html.
Peter Weiner. Linear Pattern Matching Algorithms. In 14th Annual Symposium on Switching and Automata Theory, Iowa City, Iowa, USA, October 15-17, 1973, pages 1-11, 1973.
