eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2018-07-04
39:1
39:14
10.4230/LIPIcs.ICALP.2018.39
article
Optimal Hashing in External Memory
Conway, Alex
1
Farach-Colton, Martín
1
Shilane, Philip
2
Rutgers University, New Brunswick, NJ, USA
Dell EMC, Newtown, PA, USA
Hash tables are a ubiquitous class of dictionary data structures. However, standard hash table implementations do not translate well into the external memory model, because they do not incorporate locality for insertions.
Iacono and Patrasu established an update/query tradeoff curve for external-hash tables: a hash table that performs insertions in O(lambda/B) amortized IOs requires Omega(log_lambda N) expected IOs for queries, where N is the number of items that can be stored in the data structure, B is the size of a memory transfer, M is the size of memory, and lambda is a tuning parameter. They provide a complicated hashing data structure, which we call the IP hash table, that meets this curve for lambda that is Omega(log log M + log_M N).
In this paper, we present a simpler external-memory hash table, the Bundle of Arrays Hash Table (BOA), that is optimal for a narrower range of lambda. The simplicity of BOAs allows them to be readily modified to achieve the following results:
- A new external-memory data structure, the Bundle of Trees Hash Table (BOT), that matches the performance of the IP hash table, while retaining some of the simplicity of the BOAs.
- The Cache-Oblivious Bundle of Trees Hash Table (COBOT), the first cache-oblivious hash table. This data structure matches the optimality of BOTs and IP hash tables over the same range of lambda.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol107-icalp2018/LIPIcs.ICALP.2018.39/LIPIcs.ICALP.2018.39.pdf
hash tables
external memory algorthims
cache-oblivious algorithms
asymmetric data structures