Document Open Access Logo

An Improved Data Structure for Left-Right Maximal Generic Words Problem

Authors Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai , Masayuki Takeda



PDF
Thumbnail PDF

File

LIPIcs.ISAAC.2019.40.pdf
  • Filesize: 0.69 MB
  • 12 pages

Document Identifiers

Author Details

Yuta Fujishige
  • Department of Informatics, Kyushu University, Japan
Yuto Nakashima
  • Department of Informatics, Kyushu University, Japan
Shunsuke Inenaga
  • Department of Informatics, Kyushu University, Japan
Hideo Bannai
  • Department of Informatics, Kyushu University, Japan
Masayuki Takeda
  • Department of Informatics, Kyushu University, Japan

Cite AsGet BibTex

Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. An Improved Data Structure for Left-Right Maximal Generic Words Problem. In 30th International Symposium on Algorithms and Computation (ISAAC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 149, pp. 40:1-40:12, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.ISAAC.2019.40

Abstract

For a set D of documents and a positive integer d, a string w is said to be d-left-right maximal, if (1) w occurs in at least d documents in D, and (2) any proper superstring of w occurs in less than d documents. The left-right-maximal generic words problem is, given a set D of documents, to preprocess D so that for any string p and for any positive integer d, all the superstrings of p that are d-left-right maximal can be answered quickly. In this paper, we present an O(n log m) space data structure (in words) which answers queries in O(|p| + o log log m) time, where n is the total length of documents in D, m is the number of documents in D and o is the number of outputs. Our solution improves the previous one by Nishimoto et al. (PSC 2015), which uses an O(n log n) space data structure answering queries in O(|p|+ r * log n + o * log^2 n) time, where r is the number of right-extensions q of p occurring in at least d documents such that any proper right extension of q occurs in less than d documents.

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Combinatorial algorithms
  • Mathematics of computing → Combinatorics on words
Keywords
  • generic words
  • suffix trees
  • string processing algorithms

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Omer Berkman and Uzi Vishkin. Finding Level-Ancestors in Trees. J. Comput. Syst. Sci., 48(2):214-230, 1994. Google Scholar
  2. Sudip Biswas, Manish Patil, Rahul Shah, and Sharma V. Thankachan. Succinct Indexes for Reporting Discriminating and Generic Words. In Edleno Silva de Moura and Maxime Crochemore, editors, String Processing and Information Retrieval - 21st International Symposium, SPIRE 2014, Ouro Preto, Brazil, October 20-22, 2014. Proceedings, volume 8799 of Lecture Notes in Computer Science, pages 89-100. Springer, 2014. URL: https://doi.org/10.1007/978-3-319-11918-2.
  3. Pawel Gawrychowski, Gregory Kucherov, Yakov Nekrich, and Tatiana A. Starikovskaya. Minimal Discriminating Words Problem Revisited. In Oren Kurland, Moshe Lewenstein, and Ely Porat, editors, String Processing and Information Retrieval - 20th International Symposium, SPIRE 2013, Jerusalem, Israel, October 7-9, 2013, Proceedings, volume 8214 of Lecture Notes in Computer Science, pages 129-140. Springer, 2013. URL: https://doi.org/10.1007/978-3-319-02432-5.
  4. Alexander Golynski, J. Ian Munro, and S. Srinivasa Rao. Rank/select operations on large alphabets: a tool for text indexing. In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2006, Miami, Florida, USA, January 22-26, 2006, pages 368-373. ACM Press, 2006. URL: http://dl.acm.org/citation.cfm?id=1109557.1109599.
  5. Gregory Kucherov, Yakov Nekrich, and Tatiana A. Starikovskaya. Computing Discriminating and Generic Words. In Liliana Calderón-Benavides, Cristina N. González-Caro, Edgar Chávez, and Nivio Ziviani, editors, String Processing and Information Retrieval - 19th International Symposium, SPIRE 2012, Cartagena de Indias, Colombia, October 21-25, 2012. Proceedings, volume 7608 of Lecture Notes in Computer Science, pages 307-317. Springer, 2012. URL: https://doi.org/10.1007/978-3-642-34109-0.
  6. S. Muthukrishnan. Efficient algorithms for document retrieval problems. In David Eppstein, editor, Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, January 6-8, 2002, San Francisco, CA, USA., pages 657-666. ACM/SIAM, 2002. URL: http://dl.acm.org/citation.cfm?id=545381.545469.
  7. Takaaki Nishimoto, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing Left-Right Maximal Generic Words. In Jan Holub and Jan Zdárek, editors, Proceedings of the Prague Stringology Conference 2015, Prague, Czech Republic, August 24-26, 2015, pages 5-16. Department of Theoretical Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2015. URL: http://www.stringology.org/event/2015/p02.html.
  8. Peter Weiner. Linear Pattern Matching Algorithms. In 14th Annual Symposium on Switching and Automata Theory, Iowa City, Iowa, USA, October 15-17, 1973, pages 1-11, 1973. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail