Linear-time Suffix Sorting - A New Approach for Suffix Array Construction

Author: Uwe Baier




File

LIPIcs.CPM.2016.23.pdf
  • Filesize: 0.53 MB
  • 12 pages

Cite As

Uwe Baier. Linear-time Suffix Sorting - A New Approach for Suffix Array Construction. In 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 54, pp. 23:1-23:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)
https://doi.org/10.4230/LIPIcs.CPM.2016.23

Abstract

This paper presents a new approach to linear-time suffix sorting. It introduces a new sorting principle that can be used to build GSACA, the first non-recursive linear-time suffix array construction algorithm. Although GSACA cannot match the performance of state-of-the-art suffix array construction algorithms, it introduces several new ideas for suffix array construction and can therefore be seen as an 'idea collection' for further improvements in this area.
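
For readers unfamiliar with the term, the following minimal Python sketch shows what a suffix array is: the start positions of all suffixes of a text, listed in lexicographic order of the suffixes. It is not GSACA and is not linear-time; it sorts the suffixes by direct comparison, which can take O(n^2 log n) time in the worst case. The function name `naive_suffix_array` is hypothetical and chosen only for this illustration.

```python
def naive_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text`,
    ordered by the lexicographic order of the suffixes.

    Illustration only: this is a plain comparison sort over the
    suffixes, NOT the GSACA algorithm and NOT linear-time.
    """
    # Each index i stands for the suffix text[i:]; sort the indices
    # by comparing the suffixes they represent.
    return sorted(range(len(text)), key=lambda i: text[i:])


if __name__ == "__main__":
    # Suffixes of "banana" in sorted order:
    # a (5), ana (3), anana (1), banana (0), na (4), nana (2)
    print(naive_suffix_array("banana"))  # prints [5, 3, 1, 0, 4, 2]
```

Linear-time construction algorithms such as the one described in the abstract produce the same output array without comparing whole suffixes against each other.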

Keywords

  • Suffix array
  • sorting algorithm
  • linear time
