An Efficient Vectorized Hash Table for Batch Computations

Authors: Hesam Shahrokhi, Amir Shaikhha

PDF

  • 27 pages

Author Details

Hesam Shahrokhi
  • University of Edinburgh, UK
Amir Shaikhha
  • University of Edinburgh, UK


The authors would like to thank Huawei for their support of the distributed data management and processing laboratory at the University of Edinburgh.

Cite As

Hesam Shahrokhi and Amir Shaikhha. An Efficient Vectorized Hash Table for Batch Computations. In 37th European Conference on Object-Oriented Programming (ECOOP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 263, pp. 27:1-27:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Abstract

In recent years, the increasing demand for high-performance analytics on big data has driven research on batch hash tables. Prior work shows that this type of hash table benefits from cache locality and multi-threading more than ordinary hash tables do. Moreover, the batch design is amenable to advanced features of modern processors such as prefetching and SIMD vectorization. State-of-the-art research and open-source projects on batch hash tables have proposed improved designs that make better use of these hardware features, but their approaches still do not fully exploit the existing opportunities for performance improvement. Furthermore, such hash tables lack a high-level batch API, which limits wider adoption of these high-performance data structures. In this paper, we present Vec-HT, a parallel, SIMD-vectorized, and prefetching-enabled hash table for fast batch processing. To allow developers to take full advantage of its performance, we recommend a high-level batch API design. Our experimental results show the superiority and competitiveness of this approach compared with alternative implementations and the state of the art on the data-intensive workloads of relational join processing, set operations, and sparse vector processing.
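To make the idea of a batch interface concrete, here is a minimal C++ sketch of a batch lookup with software prefetching. It is not the Vec-HT implementation or its API: the names `BatchHashTable`, `BatchResult`, `insert_batch`, and `lookup_batch`, the linear-probing layout, and the multiplicative hash are all assumptions made for illustration. SIMD vectorization and multi-threading are omitted, and `__builtin_prefetch` is a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Result of a batch lookup, stored column-wise: found[i] is 1 when keys[i]
// was present, and values[i] then holds its value (0 otherwise).
struct BatchResult {
    std::vector<std::uint64_t> values;
    std::vector<std::uint8_t> found;
};

// Hypothetical open-addressing table with a batch API (illustration only).
class BatchHashTable {
public:
    // Capacity must be a power of two so that "hash & mask_" picks a slot.
    explicit BatchHashTable(std::size_t capacity_pow2)
        : mask_(capacity_pow2 - 1), slots_(capacity_pow2), size_(0) {
        assert(capacity_pow2 != 0 && (capacity_pow2 & mask_) == 0);
    }

    // Inserts all pairs; a repeated key overwrites its earlier value.
    void insert_batch(const std::vector<std::uint64_t>& keys,
                      const std::vector<std::uint64_t>& values) {
        assert(keys.size() == values.size());
        for (std::size_t i = 0; i < keys.size(); ++i) {
            insert_one(keys[i], values[i]);
        }
    }

    // Two-pass lookup: first hash every key and prefetch its home slot,
    // then probe, so that memory latency overlaps across the batch.
    BatchResult lookup_batch(const std::vector<std::uint64_t>& keys) const {
        const std::size_t n = keys.size();
        std::vector<std::size_t> home(n);
        for (std::size_t i = 0; i < n; ++i) {
            home[i] = hash(keys[i]) & mask_;
            __builtin_prefetch(&slots_[home[i]]);  // GCC/Clang builtin
        }
        BatchResult out;
        out.values.assign(n, 0);
        out.found.assign(n, 0);
        for (std::size_t i = 0; i < n; ++i) {
            std::size_t s = home[i];
            while (slots_[s].used) {  // linear probing stops at an empty slot
                if (slots_[s].key == keys[i]) {
                    out.values[i] = slots_[s].value;
                    out.found[i] = 1;
                    break;
                }
                s = (s + 1) & mask_;
            }
        }
        return out;
    }

private:
    struct Slot {
        std::uint64_t key = 0;
        std::uint64_t value = 0;
        bool used = false;
    };

    // Simple multiplicative hash; an assumption, not the paper's hash function.
    static std::size_t hash(std::uint64_t key) {
        return static_cast<std::size_t>((key * 0x9E3779B97F4A7C15ULL) >> 32);
    }

    void insert_one(std::uint64_t key, std::uint64_t value) {
        std::size_t s = hash(key) & mask_;
        while (slots_[s].used && slots_[s].key != key) {
            s = (s + 1) & mask_;
        }
        if (!slots_[s].used) {
            // Keep at least one slot empty so that probing always terminates.
            assert(size_ + 1 < slots_.size());
            slots_[s].used = true;
            slots_[s].key = key;
            ++size_;
        }
        slots_[s].value = value;
    }

    std::size_t mask_;
    std::vector<Slot> slots_;
    std::size_t size_;
};
```

A caller passes whole key vectors, for example `table.lookup_batch({2, 99, 3})`, and reads the `found` and `values` columns of the result. For very large batches, a practical design would likely prefetch in fixed-size groups rather than across the whole batch, so that prefetched lines are still in cache when they are probed.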

Subject Classification

ACM Subject Classification
  • Theory of computation → Data structures design and analysis
  • Computer systems organization → Single instruction, multiple data

Keywords
  • Hash tables
  • Vectorization
  • Parallelization
  • Prefetching

