On Optimal Balance in B-Trees: What Does It Cost to Stay in Perfect Shape?

Authors: Rolf Fagerberg, David Hammer, Ulrich Meyer

Author Details

Rolf Fagerberg
  • University of Southern Denmark, Odense, Denmark
David Hammer
  • Goethe University Frankfurt, Germany
  • University of Southern Denmark, Odense, Denmark
Ulrich Meyer
  • Goethe University Frankfurt, Germany

Cite As

Rolf Fagerberg, David Hammer, and Ulrich Meyer. On Optimal Balance in B-Trees: What Does It Cost to Stay in Perfect Shape? In 30th International Symposium on Algorithms and Computation (ISAAC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 149, pp. 35:1-35:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019).
https://doi.org/10.4230/LIPIcs.ISAAC.2019.35

Abstract

Any B-tree has height at least ⌈log_B(n)⌉. Static B-trees achieving this height are easy to build. In the dynamic case, however, standard B-tree rebalancing algorithms only maintain a height within a constant factor of this optimum. We investigate exactly how close to ⌈log_B(n)⌉ the height of dynamic B-trees can be maintained as a function of the rebalancing cost. In this paper, we prove a lower bound on the cost of maintaining optimal height ⌈log_B(n)⌉, which shows that this cost must increase from Ω(1/B) to Ω(n/B) rebalancing per update as n grows from one power of B to the next. We also provide an almost matching upper bound, demonstrating that this lower bound is essentially tight. We then give a variant upper bound which can maintain near-optimal height at low cost. As two special cases, we can maintain optimal height for all but a vanishing fraction of values of n using Θ(log_B(n)) amortized rebalancing cost per update, and we can maintain a height of optimal plus one using O(1/B) amortized rebalancing cost per update. More generally, for any rebalancing budget, we can maintain (as n grows from one power of B to the next) optimal height essentially up to the point where the lower bound requires the budget to be exceeded, after which optimal height plus one is maintained. Finally, we prove that this balancing scheme gives B-trees with very good storage utilization.
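
The first two claims of the abstract (the height lower bound and the ease of building a static tree that meets it) can be illustrated with a minimal sketch. It is not the paper's construction and rests on assumptions made here for illustration: leaves hold up to B keys, internal nodes have up to B children, height counts node levels including the leaf level, and n ≥ 2. The function names `min_height` and `static_level_sizes` are hypothetical.

```python
def min_height(n: int, B: int) -> int:
    """Return ceil(log_B(n)) using integer arithmetic only.

    This is the smallest h with B**h >= n, so no floating-point
    logarithm (and no rounding error near powers of B) is involved.
    """
    if n < 1 or B < 2:
        raise ValueError("need n >= 1 and B >= 2")
    h, capacity = 0, 1
    while capacity < n:
        capacity *= B
        h += 1
    return h


def static_level_sizes(n: int, B: int) -> list[int]:
    """Node counts per level (leaf level first) of a packed static tree.

    Assumed model: each level groups the items below it into nodes of
    at most B entries, filling nodes as much as possible, until a
    single root remains.
    """
    if n < 1 or B < 2:
        raise ValueError("need n >= 1 and B >= 2")
    sizes = []
    items = n
    while True:
        nodes = -(-items // B)  # ceil(items / B) without floats
        sizes.append(nodes)
        if nodes == 1:
            return sizes
        items = nodes


# One key past a power of B forces an extra level.
print(min_height(1000, 10))          # 3
print(min_height(1001, 10))          # 4
print(static_level_sizes(1001, 10))  # [101, 11, 2, 1]
```

Under these assumptions the packed static tree always has exactly `min_height(n, B)` levels for n ≥ 2. The example with n = 1001 and B = 10 also hints at why the dynamic case is hard: just above a power of B the tree has a lot of slack, while just below the next power of B nearly every node must be full to keep the optimal height.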

Subject Classification

ACM Subject Classification
  • Theory of computation → Data structures design and analysis
Keywords
  • B-trees
  • Data structures
  • Lower bounds
  • Complexity
