Position Heaps for Parameterized Strings

Authors Diptarama Diptarama, Takashi Katsura, Yuhei Otomo, Kazuyuki Narisawa, Ayumi Shinohara



PDF
Thumbnail PDF

File

LIPIcs.CPM.2017.8.pdf
  • Filesize: 0.65 MB
  • 13 pages

Document Identifiers

Author Details

Diptarama Diptarama
Takashi Katsura
Yuhei Otomo
Kazuyuki Narisawa
Ayumi Shinohara

Cite AsGet BibTex

Diptarama Diptarama, Takashi Katsura, Yuhei Otomo, Kazuyuki Narisawa, and Ayumi Shinohara. Position Heaps for Parameterized Strings. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 78, pp. 8:1-8:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)
https://doi.org/10.4230/LIPIcs.CPM.2017.8

Abstract

We propose a new indexing structure for parameterized strings, called parameterized position heap. Parameterized position heap is applicable for parameterized pattern matching problem, where the pattern matches a substring of the text if there exists a bijective mapping from the symbols of the pattern to the symbols of the substring. We propose an online construction algorithm of parameterized position heap of a text and show that our algorithm runs in linear time with respect to the text size. We also show that by using parameterized position heap, we can find all occurrences of a pattern in the text in linear time with respect to the product of the pattern size and the alphabet size.
Keywords
  • string matching
  • indexing structure
  • parameterized pattern matching
  • position heap

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Brenda S. Baker. A program for identifying duplicated code. In H. Joseph Newton, editor, Proceedings of the 24th Symposium on the Interface of Computing Science and Statistics: Graphics and Visualization, volume 24, pages 49-57. Interface Foundation of North America, 1992. URL: http://www.dtic.mil/dtic/tr/fulltext/u2/a266571.pdf.
  2. Brenda S. Baker. A theory of parameterized pattern matching: algorithms and applications. In S. Rao Kosaraju, David S. Johnson, and Alok Aggarwal, editors, Proceedings of the 25th Annual ACM Symposium on Theory of Computing (STOC 1993), pages 71-80. ACM, 1993. URL: http://dx.doi.org/10.1145/167088.167115.
  3. Brenda S. Baker. Parameterized pattern matching: Algorithms and applications. J. Comput. Syst. Sci., 52(1):28-42, 1996. URL: http://dx.doi.org/10.1006/jcss.1996.0003.
  4. Edward G. Coffman Jr. and James Eve. File structures using hashing functions. Commun. ACM, 13(7):427-432, 1970. URL: http://dx.doi.org/10.1145/362686.362693.
  5. Thomas H. Cormen, Charies E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. MIT press, 2009. URL: https://mitpress.mit.edu/books/introduction-algorithms.
  6. Maxime Crochemore and Wojciech Rytter. Jewels of Stringology: Text Algorithms. World Scientific, 2002. URL: http://dx.doi.org/10.1142/9789812778222.
  7. Satoshi Deguchi, Fumihito Higashijima, Hideo Bannai, Shunsuke Inenaga, and Masayuki Takeda. Parameterized suffix arrays for binary strings. In Jan Holub and Jan Zdárek, editors, Proceedings of the Prague Stringology Conference 2008, pages 84-94, Czech Technical University in Prague, Czech Republic, 2008. URL: http://www.stringology.org/event/2008/p08.html.
  8. Andrzej Ehrenfeucht, Ross M. McConnell, Nissa Osheim, and Sung-Whan Woo. Position heaps: A simple and dynamic text indexing data structure. J. Discrete Algorithms, 9(1):100-121, 2011. URL: http://dx.doi.org/10.1016/j.jda.2010.12.001.
  9. Kimmo Fredriksson and Maxim Mozgovoy. Efficient parameterized string matching. Inf. Process. Lett., 100(3):91-96, 2006. URL: http://dx.doi.org/10.1016/j.ipl.2006.06.009.
  10. Travis Gagie, Wing-Kai Hon, and Tsung-Han Ku. New algorithms for position heaps. In Johannes Fischer and Peter Sanders, editors, Proceedings of the 24th Annual Symposium on Combinatorial Pattern Matching (CPM 2013), volume 7922 of LNCS, pages 95-106, Berlin, Heidelberg, 2013. Springer Berlin Heidelberg. URL: http://dx.doi.org/10.1007/978-3-642-38905-4_11.
  11. Dan Gusfield. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997. URL: http://dx.doi.org/10.1017/CBO9780511574931.
  12. Tomohiro I, Satoshi Deguchi, Hideo Bannai, Shunsuke Inenaga, and Masayuki Takeda. Lightweight parameterized suffix array construction. In Jirí Fiala, Jan Kratochvíl, and Mirka Miller, editors, Proceedings of the 20th International Workshop on Combinatorial Algorithms (IWOCA 2009), volume 5874 of LNCS, pages 312-323, Berlin, Heidelberg, 2009. Springer Berlin Heidelberg. URL: http://dx.doi.org/10.1007/978-3-642-10217-2_31.
  13. Gregory Kucherov. On-line construction of position heaps. J. Discrete Algorithms, 20:3-11, 2013. StringMasters 2011 Special Issue. URL: http://dx.doi.org/10.1016/j.jda.2012.08.002.
  14. Yuto Nakashima, Tomohiro I, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. The position heap of a trie. In Liliana Calderón-Benavides, Cristina N. González-Caro, Edgar Chávez, and Nivio Ziviani, editors, Proceedings of the 19th International Symposium on String Processing and Information Retrieval (SPIRE 2012), volume 7608 of LNCS, pages 360-371, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg. URL: http://dx.doi.org/10.1007/978-3-642-34109-0_38.
  15. Tetsuo Shibuya. Generalization of a suffix tree for RNA structural pattern matching. Algorithmica, 39(1):1-19, 2004. URL: http://dx.doi.org/10.1007/s00453-003-1067-9.
  16. Esko Ukkonen. On-line construction of suffix trees. Algorithmica, 14(3):249-260, 1995. URL: http://dx.doi.org/10.1007/BF01206331.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail