Constructing the Bijective and the Extended Burrows-Wheeler Transform in Linear Time

Authors Hideo Bannai , Juha Kärkkäinen, Dominik Köppl , Marcin Piątkowski



PDF
Thumbnail PDF

File

LIPIcs.CPM.2021.7.pdf
  • Filesize: 0.82 MB
  • 16 pages

Document Identifiers

Author Details

Hideo Bannai
  • M&D Data Science Center, Tokyo Medical and Dental University, Tokyo, Japan
Juha Kärkkäinen
  • Helsinki Institute of Information Technology (HIIT), Finland
Dominik Köppl
  • M&D Data Science Center, Tokyo Medical and Dental University, Tokyo, Japan
Marcin Piątkowski
  • Nicolaus Copernicus University, Toruń, Poland

Cite AsGet BibTex

Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, and Marcin Piątkowski. Constructing the Bijective and the Extended Burrows-Wheeler Transform in Linear Time. In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 7:1-7:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.CPM.2021.7

Abstract

The Burrows-Wheeler transform (BWT) is a permutation whose applications are prevalent in data compression and text indexing. The bijective BWT (BBWT) is a bijective variant of it. Although it is known that the BWT can be constructed in linear time for integer alphabets by using a linear time suffix array construction algorithm, it was up to now only conjectured that the BBWT can also be constructed in linear time. We confirm this conjecture in the word RAM model by proposing a construction algorithm that is based on SAIS, improving the best known result of O(n lg n / lg lg n) time to linear. Since we can reduce the problem of constructing the extended BWT to constructing the BBWT in linear time, we obtain a linear-time algorithm computing the extended BWT at the same time.

Subject Classification

ACM Subject Classification
  • Theory of computation
  • Mathematics of computing → Combinatorics on words
Keywords
  • Burrows-Wheeler Transform
  • Lyndon words
  • Circular Suffix Array
  • Suffix Array Construction Algorithm

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Donald Adjeroh, Timothy Bell, and Amar Mukherjee. The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching. Springer, 2008. Google Scholar
  2. Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, and Marcin Piatkowski. Indexing the bijective BWT. In Proc. CPM, volume 128 of LIPIcs, pages 17:1-17:14, 2019. Google Scholar
  3. Silvia Bonomo, Sabrina Mantaci, Antonio Restivo, Giovanna Rosone, and Marinella Sciortino. Sorting conjugates and suffixes of words in a multiset. Int. J. Found. Comput. Sci., 25(8):1161, 2014. Google Scholar
  4. Michael Burrows and David J. Wheeler. A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation, Palo Alto, California, 1994. Google Scholar
  5. Kuo Tsai Chen, Ralph H. Fox, and Roger C. Lyndon. Free differential calculus, IV. The quotient groups of the lower central series. Annals of Mathematics, pages 81-95, 1958. Google Scholar
  6. Francesco Dolce, Antonio Restivo, and Christophe Reutenauer. On generalized Lyndon words. Theor. Comput. Sci., 777:232-242, 2019. Google Scholar
  7. Jean-Pierre Duval. Factorizing words over an ordered alphabet. J. Algorithms, 4(4):363-381, 1983. Google Scholar
  8. Paolo Ferragina and Giovanni Manzini. Opportunistic data structures with applications. In Proc. FOCS, pages 390-398, 2000. Google Scholar
  9. Paolo Ferragina and Giovanni Manzini. Indexing compressed text. J. ACM, 52(4):552-581, 2005. Google Scholar
  10. Travis Gagie, Gonzalo Navarro, and Nicola Prezza. Optimal-time text indexing in BWT-runs bounded space. In Proc. SODA, pages 1459-1477, 2018. Google Scholar
  11. Daniel Gibney and Sharma V. Thankachan. Finding an optimal alphabet ordering for Lyndon factorization is hard. In Proc. STACS, volume 187 of LIPIcs, pages 35:1-35:15, 2021. Google Scholar
  12. Joseph Yossi Gil and David Allen Scott. A bijective string sorting transform. ArXiv 1201.3077, 2012. URL: http://arxiv.org/abs/1201.3077.
  13. Keisuke Goto. Optimal time and space construction of suffix arrays and LCP arrays for integer alphabets. In Proc. PSC, pages 111-125, 2019. Google Scholar
  14. Wing-Kai Hon, Tsung-Han Ku, Chen-Hua Lu, Rahul Shah, and Sharma V. Thankachan. Efficient algorithm for circular Burrows-Wheeler transform. In Proc. CPM, volume 7354 of LNCS, pages 257-268, 2012. Google Scholar
  15. Wing-Kai Hon, Chen-Hua Lu, Rahul Shah, and Sharma V. Thankachan. Succinct indexes for circular patterns. In Proc. ISAAC, volume 7074 of LNCS, pages 673-682, 2011. Google Scholar
  16. Guy Jacobson. Space-efficient static trees and graphs. In Proc. FOCS, pages 549-554, 1989. Google Scholar
  17. Juha Kärkkäinen, Peter Sanders, and Stefan Burkhardt. Linear work suffix array construction. J. ACM, 53(6):918-936, 2006. Google Scholar
  18. Pang Ko and Srinivas Aluru. Space efficient linear time construction of suffix arrays. J. Discrete Algorithms, 3(2-4):143-156, 2005. Google Scholar
  19. Manfred Kufleitner. On bijective variants of the Burrows-Wheeler transform. In Proc. PSC, pages 65-79, 2009. Google Scholar
  20. Zhize Li, Jian Li, and Hongwei Huo. Optimal in-place suffix sorting. In Proc. SPIRE, volume 11147 of LNCS, pages 268-284, 2018. Google Scholar
  21. R. C. Lyndon. On Burnside’s problem. Transactions of the American Mathematical Society, 77(2):202-215, 1954. Google Scholar
  22. Sabrina Mantaci, Antonio Restivo, Giovanna Rosone, and Marinella Sciortino. An extension of the Burrows-Wheeler transform. Theor. Comput. Sci., 387(3):298-312, 2007. Google Scholar
  23. Ge Nong, Sen Zhang, and Wai Hong Chan. Two efficient algorithms for linear time suffix array construction. IEEE Trans. Computers, 60(10):1471-1484, 2011. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail