What Does Dynamic Optimality Mean in External Memory?

Authors Michael A. Bender, Martín Farach-Colton, William Kuszmaul



PDF
Thumbnail PDF

File

LIPIcs.ITCS.2022.18.pdf
  • Filesize: 0.77 MB
  • 23 pages

Document Identifiers

Author Details

Michael A. Bender
  • Stony Brook University, Stony Brook, NY, USA
Martín Farach-Colton
  • Rutgers University, New Brunswick, NJ, USA
William Kuszmaul
  • MIT, Cambridge, MA, USA

Acknowledgements

Portions of this work were completed at the Second Hawaii Workshop on Parallel Algorithms and Data Structures. The authors would like to thank Nodari Sitchinava for organizing the workshop.

Cite As Get BibTex

Michael A. Bender, Martín Farach-Colton, and William Kuszmaul. What Does Dynamic Optimality Mean in External Memory?. In 13th Innovations in Theoretical Computer Science Conference (ITCS 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 215, pp. 18:1-18:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022) https://doi.org/10.4230/LIPIcs.ITCS.2022.18

Abstract

A data structure A is said to be dynamically optimal over a class of data structures 𝒞 if A is constant-competitive with every data structure C ∈ 𝒞. Much of the research on binary search trees in the past forty years has focused on studying dynamic optimality over the class of binary search trees that are modified via rotations (and indeed, the question of whether splay trees are dynamically optimal has gained notoriety as the so-called dynamic-optimality conjecture). Recently, researchers have extended this to consider dynamic optimality over certain classes of external-memory search trees. In particular, Demaine, Iacono, Koumoutsos, and Langerman propose a class of external-memory trees that support a notion of tree rotations, and then give an elegant data structure, called the Belga B-tree, that is within an O(log log N)-factor of being dynamically optimal over this class.
In this paper, we revisit the question of how dynamic optimality should be defined in external memory. A defining characteristic of external-memory data structures is that there is a stark asymmetry between queries and inserts/updates/deletes: by making the former slightly asymptotically slower, one can make the latter significantly asymptotically faster (even allowing for operations with sub-constant amortized I/Os). This asymmetry makes it so that rotation-based search trees are not optimal (or even close to optimal) in insert/update/delete-heavy external-memory workloads. To study dynamic optimality for such workloads, one must consider a different class of data structures.
The natural class of data structures to consider are what we call buffered-propagation trees. Such trees can adapt dynamically to the locality properties of an input sequence in order to optimize the interactions between different inserts/updates/deletes and queries. We also present a new form of beyond-worst-case analysis that allows for us to formally study a continuum between static and dynamic optimality. Finally, we give a novel data structure, called the Jεllo Tree, that is statically optimal and that achieves dynamic optimality for a large natural class of inputs defined by our beyond-worst-case analysis.

Subject Classification

ACM Subject Classification
  • Theory of computation → Sorting and searching
Keywords
  • Dynamic optimality
  • external memory
  • buffer propagation
  • search trees

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. George M Adel’son-Vel’skii and Evgenii Mikhailovich Landis. An algorithm for organization of information. In Doklady Akademii Nauk, volume 146, pages 263-266. Russian Academy of Sciences, 1962. Google Scholar
  2. Alok Aggarwal and S. Vitter, Jeffrey. The input/output complexity of sorting and related problems. Communications of the ACM, 31(9):1116-1127, September 1988. Google Scholar
  3. Lars Arge. The buffer tree: A new technique for optimal i/o-algorithms. In Workshop on Algorithms and Data structures, pages 334-345. Springer, 1995. Google Scholar
  4. Peter Auer, Nicolo Cesa-Bianchi, Yoav Freund, and Robert E Schapire. Gambling in a rigged casino: The adversarial multi-armed bandit problem. In Proceedings of IEEE 36th Annual Foundations of Computer Science, pages 322-331. IEEE, 1995. Google Scholar
  5. Mihai Bădoiu, Richard Cole, Erik D Demaine, and John Iacono. A unified access bound on comparison-based dynamic dictionaries. Theoretical Computer Science, 382(2):86-96, 2007. Google Scholar
  6. Rudolf Bayer. Symmetric binary b-trees: Data structure and maintenance algorithms. Acta informatica, 1(4):290-306, 1972. Google Scholar
  7. Rudolf Bayer and Edward M. McCreight. Organization and maintenance of large ordered indexes. Acta Informatica, 1(3):173-189, February 1972. URL: https://doi.org/10.1145/1734663.1734671.
  8. Michael A Bender, Rathish Das, Martín Farach-Colton, Rob Johnson, and William Kuszmaul. Flushing without cascades. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 650-669. SIAM, 2020. Google Scholar
  9. Michael A. Bender, Martin Farach-Colton, Jeremy T. Fineman, Yonatan R. Fogel, Bradley C. Kuszmaul, and Jelani Nelson. Cache-oblivious streaming B-trees. In Proc. 19th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 81-92, 2007. Google Scholar
  10. Michael A Bender, Martin Farach-Colton, William Jannen, Rob Johnson, Bradley C Kuszmaul, Donald E Porter, Jun Yuan, and Yang Zhan. And introduction to be-trees and write-optimization. Login; Magazine, 40(5), 2015. Google Scholar
  11. Michael A Bender, Martin Farach-Colton, and William Kuszmaul. What does dynamic optimality mean in external memory? arXiv preprint, 2021. Google Scholar
  12. Dhruba Borthakur. Under the hood: Building and open-sourcing rocksdb. Facebook Engineering Notes, 2013. Google Scholar
  13. Presenjit Bose, Karim Douïeb, John Iacono, and Stefan Langerman. The power and limitations of static binary search trees with lazy finger. In International Symposium on Algorithms and Computation, pages 181-192. Springer, 2014. Google Scholar
  14. Prosenjit Bose, Karim Douïeb, and Stefan Langerman. Dynamic optimality for skip lists and b-trees. In Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms, pages 1106-1114. Citeseer, 2008. Google Scholar
  15. Gerth Stølting Brodal and Rolf Fagerberg. Lower bounds for external memory dictionaries. In Proc. 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 546-554, 2003. Google Scholar
  16. Adam L Buchsbaum, Michael H Goldwasser, Suresh Venkatasubramanian, and Jeffery R Westbrook. On external memory graph traversal. In SODA, pages 859-860, 2000. Google Scholar
  17. Parinya Chalermsook, Mayank Goswami, László Kozma, Kurt Mehlhorn, and Thatchaphol Saranurak. Multi-finger binary search trees. arXiv preprint arXiv:1809.01759, 2018. Google Scholar
  18. Richard Cole, Bud Mishra, Jeanette Schmidt, and Alan Siegel. On the dynamic finger conjecture for splay trees. part i: Splay sorting log n-block sequences. SIAM Journal on Computing, 30(1):1-43, 2000. Google Scholar
  19. Richard Cole, Bud Mishra, Jeanette Schmidt, and Alan Siegel. On the dynamic finger conjecture for splay trees. part i: Splay sorting log n-block sequences. SIAM Journal on Computing, 30(1):1-43, 2000. Google Scholar
  20. Douglas Comer. The ubiquitous B-tree. ACM Computing Surveys, 11(2):121-137, June 1979. Google Scholar
  21. Alex Conway, Martin Farach-Colton, and Philip Shilane. Optimal hashing in external memory. In 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018, page 39. Schloss Dagstuhl-Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 2018. Google Scholar
  22. T.H. Cormen, C.E. Leiserson, R.L. Rivest, and C. Stein. Introduction To Algorithms. MIT Press, 2001. Google Scholar
  23. Erik D Demaine, Dion Harmon, John Iacono, and Mihai P a ˇ traşcu. Dynamic optimality—almost. SIAM Journal on Computing, 37(1):240-251, 2007. Google Scholar
  24. Erik D Demaine, John Iacono, Grigorios Koumoutsos, and Stefan Langerman. Belga b-trees. In International Computer Science Symposium in Russia, pages 93-105. Springer, 2019. Google Scholar
  25. John Howat, John Iacono, and Pat Morin. The fresh-finger property. arXiv preprint arXiv:1302.6914, 2013. Google Scholar
  26. John Iacono. Alternatives to splay trees with o (log n) worst-case access times. In Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms, pages 516-522. Society for Industrial and Applied Mathematics, 2001. Google Scholar
  27. John Iacono. In pursuit of the dynamic optimality conjecture. In Space-Efficient Data Structures, Streams, and Algorithms, pages 236-250. Springer, 2013. Google Scholar
  28. William Jannen, Jun Yuan, Yang Zhan, Amogh Akshintala, John Esmet, Yizheng Jiao, Ankur Mittal, Prashant Pandey, Phaneendra Reddy, Leif Walsh, et al. Betrfs: Write-optimization in a kernel file system. ACM Transactions on Storage (TOS), 11(4):1-29, 2015. Google Scholar
  29. Patrick O'Neil, Edward Cheng, Dieter Gawlic, and Elizabeth O'Neil. The log-structured merge-tree (LSM-tree). Acta Informatica, 33(4):351-385, 1996. URL: https://doi.org/10.1007/s002360050048.
  30. Pandian Raju, Rohan Kadekodi, Vijay Chidambaram, and Ittai Abraham. Pebblesdb: Building key-value stores using fragmented log-structured merge trees. In Proceedings of the 26th Symposium on Operating Systems Principles, pages 497-514, 2017. Google Scholar
  31. Murray Sherk. Self-adjusting k-ary search trees. Journal of Algorithms, 19(1):25-44, 1995. Google Scholar
  32. Daniel D Sleator and Robert Endre Tarjan. A data structure for dynamic trees. Journal of computer and system sciences, 26(3):362-391, 1983. Google Scholar
  33. Daniel Dominic Sleator and Robert Endre Tarjan. Self-adjusting binary search trees. Journal of the ACM (JACM), 32(3):652-686, 1985. Google Scholar
  34. Robert Endre Tarjan. Sequential access in splay trees takes linear time. Combinatorica, 5(4):367-378, 1985. Google Scholar
  35. Tokutek, Inc. TokuDBregistered for MySQL Storage Engine, 2009. URL: http://www.tokutek.com.
  36. Tokutek Inc. TokuDB. http://www.tokutek.com/, 2011.
  37. Jun Yuan, Yang Zhan, William Jannen, Prashant Pandey, Amogh Akshintala, Kanchan Chandnani, Pooja Deo, Zardosht Kasheff, Leif Walsh, Michael Bender, et al. Optimizing every operation in a write-optimized file system. In 14th USENIX Conference on File and Storage Technologies (FAST 16), pages 1-14, 2016. Google Scholar
  38. Jun Yuan, Yang Zhan, William Jannen, Prashant Pandey, Amogh Akshintala, Kanchan Chandnani, Pooja Deo, Zardosht Kasheff, Leif Walsh, Michael A Bender, et al. Writes wrought right, and other adventures in file system optimization. ACM Transactions on Storage (TOS), 13(1):1-26, 2017. Google Scholar
  39. Yang Zhan, Alex Conway, Yizheng Jiao, Eric Knorr, Michael A Bender, Martin Farach-Colton, William Jannen, Rob Johnson, Donald E Porter, and Jun Yuan. The full path to full-path indexing. In 16th USENIX Conference on File and Storage Technologies (FAST 18), pages 123-138, 2018. Google Scholar
  40. Yang Zhan, Alexander Conway, Yizheng Jiao, Nirjhar Mukherjee, Ian Groombridge, Michael A Bender, Martin Farach-Colton, William Jannen, Rob Johnson, Donald E Porter, et al. How to copy files. In 18th USENIX Conference on File and Storage Technologies (FAST 20), pages 75-89, 2020. Google Scholar
  41. Yang Zhan, Yizheng Jiao, Donald E Porter, Alex Conway, Eric Knorr, Martin Farach-Colton, Michael A Bender, Jun Yuan, William Jannen, and Rob Johnson. Efficient directory mutations in a full-path-indexed file system. ACM Transactions on Storage (TOS), 14(3):1-27, 2018. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail