Non-Overlapping Indexing - Cache Obliviously

Authors Sahar Hooshmand, Paniz Abedin, M. Oguzhan Külekci, Sharma V. Thankachan



PDF
Thumbnail PDF

File

LIPIcs.CPM.2018.8.pdf
  • Filesize: 465 kB
  • 9 pages

Document Identifiers

Author Details

Sahar Hooshmand
  • Dept. of Computer Science, University of Central Florida - Orlando, USA
Paniz Abedin
  • Dept. of Computer Science, University of Central Florida - Orlando, USA
M. Oguzhan Külekci
  • Informatics Institute, Istanbul Technical University - Turkey
Sharma V. Thankachan
  • Dept. of Computer Science, University of Central Florida - Orlando, USA

Cite As Get BibTex

Sahar Hooshmand, Paniz Abedin, M. Oguzhan Külekci, and Sharma V. Thankachan. Non-Overlapping Indexing - Cache Obliviously. In 29th Annual Symposium on Combinatorial Pattern Matching (CPM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 105, pp. 8:1-8:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018) https://doi.org/10.4230/LIPIcs.CPM.2018.8

Abstract

The non-overlapping indexing problem is defined as follows: pre-process a given text T[1,n] of length n into a data structure such that whenever a pattern P[1,p] comes as an input, we can efficiently report the largest set of non-overlapping occurrences of P in T. The best known solution is by Cohen and Porat [ISAAC, 2009]. Their index size is O(n) words and query time is optimal O(p+nocc), where nocc is the output size. We study this problem in the cache-oblivious model and present a new data structure of size O(n log n) words. It can answer queries in optimal O(p/(B)+log_B n+nocc/B) I/Os, where B is the block size.

Subject Classification

ACM Subject Classification
  • Theory of computation → Pattern matching
Keywords
  • Suffix Trees
  • Cache Oblivious
  • Data Structure
  • String Algorithms

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Alok Aggarwal and Jeffrey Scott Vitter. The input/output complexity of sorting and related problems. Commun. ACM, 31(9):1116-1127, 1988. URL: http://dx.doi.org/10.1145/48529.48535.
  2. Alberto Apostolico and Franco P Preparata. Data structures and algorithms for the string statistics problem. Algorithmica, 15(5):481-494, 1996. Google Scholar
  3. Michael A. Bender, Erik D. Demaine, and Martin Farach-Colton. Cache-oblivious b-trees. SIAM J. Comput., 35(2):341-358, 2005. URL: http://dx.doi.org/10.1137/S0097539701389956.
  4. Gerth Stølting Brodal and Rolf Fagerberg. Cache-oblivious string dictionaries. In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2006, Miami, Florida, USA, January 22-26, 2006, pages 581-590. ACM Press, 2006. URL: http://dl.acm.org/citation.cfm?id=1109557.1109621.
  5. Hagai Cohen and Ely Porat. Range non-overlapping indexing. In Yingfei Dong, Ding-Zhu Du, and Oscar H. Ibarra, editors, Algorithms and Computation, 20th International Symposium, ISAAC 2009, Honolulu, Hawaii, USA, December 16-18, 2009. Proceedings, volume 5878 of Lecture Notes in Computer Science, pages 1044-1053. Springer, 2009. URL: http://dx.doi.org/10.1007/978-3-642-10631-6_105.
  6. Maxime Crochemore. String-matching on ordered alphabets. Theoretical Computer Science, 92(1):33-47, 1992. Google Scholar
  7. Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. Cache-oblivious algorithms. In 40th Annual Symposium on Foundations of Computer Science, FOCS '99, 17-18 October, 1999, New York, NY, USA, pages 285-298. IEEE Computer Society, 1999. URL: http://dx.doi.org/10.1109/SFFCS.1999.814600.
  8. Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. Cache-oblivious algorithms. ACM Trans. Algorithms, 8(1):4:1-4:22, 2012. URL: http://dx.doi.org/10.1145/2071379.2071383.
  9. Arnab Ganguly, Rahul Shah, and Sharma V Thankachan. Succinct non-overlapping indexing. In Annual Symposium on Combinatorial Pattern Matching, pages 185-195. Springer, 2015. Google Scholar
  10. Dov Harel and Robert Endre Tarjan. Fast algorithms for finding nearest common ancestors. SIAM J. Comput., 13(2):338-355, 1984. URL: http://dx.doi.org/10.1137/0213024.
  11. Orgad Keller, Tsvi Kopelowitz, and Moshe Lewenstein. Range non-overlapping indexing and successive list indexing. In Frank K. H. A. Dehne, Jörg-Rüdiger Sack, and Norbert Zeh, editors, Algorithms and Data Structures, 10th International Workshop, WADS 2007, Halifax, Canada, August 15-17, 2007, Proceedings, volume 4619 of Lecture Notes in Computer Science, pages 625-636. Springer, 2007. URL: http://dx.doi.org/10.1007/978-3-540-73951-7_54.
  12. Udi Manber and Eugene W. Myers. Suffix arrays: A new method for on-line string searches. SIAM J. Comput., 22(5):935-948, 1993. URL: http://dx.doi.org/10.1137/0222058.
  13. Yakov Nekrich and Gonzalo Navarro. Sorted range reporting. In Fedor V. Fomin and Petteri Kaski, editors, Algorithm Theory - SWAT 2012 - 13th Scandinavian Symposium and Workshops, Helsinki, Finland, July 4-6, 2012. Proceedings, volume 7357 of Lecture Notes in Computer Science, pages 271-282. Springer, 2012. URL: http://dx.doi.org/10.1007/978-3-642-31155-0_24.
  14. Esko Ukkonen. On-line construction of suffix trees. Algorithmica, 14(3):249-260, 1995. URL: http://dx.doi.org/10.1007/BF01206331.
  15. Peter Weiner. Linear pattern matching algorithms. In 14th Annual Symposium on Switching and Automata Theory, Iowa City, Iowa, USA, October 15-17, 1973, pages 1-11. IEEE Computer Society, 1973. URL: http://dx.doi.org/10.1109/SWAT.1973.13.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail