The Priority R-Tree: A Practically Efficient and Worst-Case-Optimal R-Tree

Authors Lars Arge, Mark de Berg, Herman J. Haverkort, Ke Yi



PDF
Thumbnail PDF

File

DagSemProc.04301.3.pdf
  • Filesize: 427 kB
  • 26 pages

Document Identifiers

Author Details

Lars Arge
Mark de Berg
Herman J. Haverkort
Ke Yi

Cite As Get BibTex

Lars Arge, Mark de Berg, Herman J. Haverkort, and Ke Yi. The Priority R-Tree: A Practically Efficient and Worst-Case-Optimal R-Tree. In Cache-Oblivious and Cache-Aware Algorithms. Dagstuhl Seminar Proceedings, Volume 4301, pp. 1-26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2005) https://doi.org/10.4230/DagSemProc.04301.3

Abstract

The query efficiency of a data structure that stores a set of objects, can normally be assessed by analysing the number of objects, pointers etc. looked at when answering a query. However, if the data structure is too big to fit in main memory, data may need to be fetched from disk. In that case, the query efficiency is easily dominated by moving the disk head to the correct locations, rather than by reading the data itself.

To reduce the number of disk accesses, once can group the data into blocks, and strive to bound the number of different blocks accessed rather than the number of individual data objects read. An R-tree is a general-purpose data structur that stores a hierarchical grouping of geometric objects into blocks. Many heuristics have been designed to determine which objects should be grouped together, but none of these heuristics could give a guarantee on the resulting worst-case query time.

We present the Priority R-tree, or PR-tree, which is the first R-tree variant that always answers a window query by accessing $O((N/B)^{1-1/d} + T/B)$ blocks, where $N$ is the number of $d$-dimensional objects stored, $B$ is the number of objects per block, and $T$ is the number of objects whose bounding boxes intersect the query window. This is provably asymptotically optimal. Experiments show that the PR-tree performs similar to the best known heuristics on real-life and relatively nicely distributed data, but outperforms them significantly on more extreme data.

Subject Classification

Keywords
  • R-Trees

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail