Parallel Finger Search Structures

Gilbert, Seth; Lim, Wei Quan

doi:10.4230/LIPIcs.DISC.2019.20

Author Details

Seth Gilbert

Computer Science, National University of Singapore

Wei Quan Lim

Computer Science, National University of Singapore

Cite As

Seth Gilbert and Wei Quan Lim. Parallel Finger Search Structures. In 33rd International Symposium on Distributed Computing (DISC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 146, pp. 20:1-20:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.DISC.2019.20

Abstract

In this paper we present two versions of a parallel finger structure FS on p processors that supports searches, insertions and deletions, and has a finger at each end. This is, to our knowledge, the first implementation of a parallel search structure that is work-optimal with respect to the finger bound and yet has very good parallelism (within a factor of O((log p)^2) of optimal). We utilize an extended implicit batching framework that transparently facilitates the use of FS by any parallel program P that is modelled by a dynamically generated DAG D where each node is either a unit-time instruction or a call to FS. The work done by FS is bounded by the finger bound F_L (for some linearization L of D), i.e. each operation on an item with distance r from a finger takes O(log r + 1) amortized work. Running P using the simpler version takes O((T_1 + F_L)/p + T_∞ + d · ((log p)^2 + log n)) time on a greedy scheduler, where T_1 and T_∞ are the size and span of D respectively, n is the maximum number of items in FS, and d is the maximum number of calls to FS along any path in D. Using the faster version, this is reduced to O((T_1 + F_L)/p + T_∞ + d · (log p)^2 + s_L) time, where s_L is the weighted span of D in which each call to FS is weighted by its cost according to F_L. FS can be extended to a fixed number of movable fingers. The data structures in our paper fit into the dynamic multithreading paradigm, and their performance bounds are directly composable with other data structures given in the same paradigm. Also, the results can be translated to practical implementations using work-stealing schedulers.
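To make the finger bound concrete, the following is a minimal sequential sketch, not the parallel structure FS from the paper. It assumes a static sorted Python list with a finger at each end, and the function name `finger_search` is hypothetical. It gallops inward from both ends in doubling steps and then finishes with a binary search, so a key at distance r from the nearer end is located after roughly log r + 1 doubling rounds plus an O(log r) binary search.

```python
from bisect import bisect_left


def finger_search(items, key):
    """Locate key in sorted `items` by galloping inward from both ends.

    Returns (index, steps): index equals bisect_left(items, key), and
    steps is the number of doubling rounds used before the binary search.
    Illustrative only: a static list stands in for the search structure.
    """
    n = len(items)
    if n == 0:
        return 0, 0
    step, steps = 1, 0
    while True:
        steps += 1
        left = min(step, n) - 1    # probe reached from the left finger
        right = max(n - step, 0)   # probe reached from the right finger
        if items[left] >= key:
            # Earlier left probes were all < key, so the answer
            # lies in the index range [step // 2, left].
            return bisect_left(items, key, step // 2, left + 1), steps
        if items[right] < key:
            # Earlier right probes were all >= key, so the answer
            # lies in the index range [right + 1, n - step // 2].
            return bisect_left(items, key, right + 1, n - step // 2), steps
        step *= 2


if __name__ == "__main__":
    data = list(range(0, 2000, 2))      # 1000 sorted items
    print(finger_search(data, 4))       # near the left finger: few rounds
    print(finger_search(data, 1000))    # near the middle: the most rounds
```

Summing the per-operation costs over a sequence of operations gives a quantity of the same shape as F_L. The sketch does not handle insertions, deletions, batching, or amortization, all of which the paper's structure has to address.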

Subject Classification

ACM Subject Classification

Theory of computation → Parallel algorithms
Theory of computation → Shared memory algorithms
Theory of computation → Parallel computing models

Keywords

Parallel data structures
Multithreading
Dictionaries
Comparison-based Search
Distribution-sensitive algorithms

