Parallel Finger Search Structures

Authors Seth Gilbert, Wei Quan Lim



PDF
Thumbnail PDF

File

LIPIcs.DISC.2019.20.pdf
  • Filesize: 0.57 MB
  • 18 pages

Document Identifiers

Author Details

Seth Gilbert
  • Computer Science, National University of Singapore
Wei Quan Lim
  • Computer Science, National University of Singapore

Acknowledgements

We would like to express our gratitude to our families and friends for their wholehearted support, to the kind reviewers who provided helpful feedback, and to all others who have given us valuable comments and advice.

Cite AsGet BibTex

Seth Gilbert and Wei Quan Lim. Parallel Finger Search Structures. In 33rd International Symposium on Distributed Computing (DISC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 146, pp. 20:1-20:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.DISC.2019.20

Abstract

In this paper we present two versions of a parallel finger structure FS on p processors that supports searches, insertions and deletions, and has a finger at each end. This is to our knowledge the first implementation of a parallel search structure that is work-optimal with respect to the finger bound and yet has very good parallelism (within a factor of O(log p)^2) of optimal). We utilize an extended implicit batching framework that transparently facilitates the use of FS by any parallel program P that is modelled by a dynamically generated DAG D where each node is either a unit-time instruction or a call to FS. The work done by FS is bounded by the finger bound F_L (for some linearization L of D), i.e. each operation on an item with distance r from a finger takes O(log r+1) amortized work. Running P using the simpler version takes O((T_1+F_L)/p + T_infty + d * ((log p)^2 + log n)) time on a greedy scheduler, where T_1, T_infty are the size and span of D respectively, and n is the maximum number of items in FS, and d is the maximum number of calls to FS along any path in D. Using the faster version, this is reduced to O((T_1+F_L)/p + T_infty + d *(log p)^2 + s_L) time, where s_L is the weighted span of D where each call to FS is weighted by its cost according to F_L. FS can be extended to a fixed number of movable fingers. The data structures in our paper fit into the dynamic multithreading paradigm, and their performance bounds are directly composable with other data structures given in the same paradigm. Also, the results can be translated to practical implementations using work-stealing schedulers.

Subject Classification

ACM Subject Classification
  • Theory of computation → Parallel algorithms
  • Theory of computation → Shared memory algorithms
  • Theory of computation → Parallel computing models
Keywords
  • Parallel data structures
  • Multithreading
  • Dictionaries
  • Comparison-based Search
  • Distribution-sensitive algorithms

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Yehuda Afek, Haim Kaplan, Boris Korenfeld, Adam Morrison, and Robert E Tarjan. The CB tree: a practical concurrent self-adjusting search tree. Distributed computing, 27(6):393-417, 2014. Google Scholar
  2. Yehuda Afek, Haim Kaplan, Boris Korenfeld, Adam Morrison, and Robert Endre Tarjan. CBTree: A Practical Concurrent Self-Adjusting Search Tree. In DISC, volume 7611 of Lecture Notes in Computer Science, pages 1-15. Springer, 2012. Google Scholar
  3. Kunal Agrawal, Jeremy T Fineman, Kefu Lu, Brendan Sheridan, Jim Sukha, and Robert Utterback. Provably good scheduling for parallel programs that use data structures through implicit batching. In Proceedings of the 26th ACM symposium on Parallelism in algorithms and architectures, pages 84-95. ACM, 2014. Google Scholar
  4. Kunal Agrawal, Seth Gilbert, and Wei Quan Lim. Parallel Working-Set Search Structures. In Proceedings of the 30th ACM symposium on Parallelism in algorithms and architectures, pages 321-332. ACM, 2018. URL: http://arxiv.org/abs/1805.05787.
  5. Yaroslav Akhremtsev and Peter Sanders. Fast parallel operations on search trees. In 2016 IEEE 23rd International Conference on High Performance Computing (HiPC), pages 291-300. IEEE, 2016. Google Scholar
  6. Vitaly Aksenov, Petr Kuznetsov, and Anatoly Shalyto. Parallel Combining: Benefits of Explicit Synchronization. In Jiannong Cao, Faith Ellen, Luis Rodrigues, and Bernardo Ferreira, editors, 22nd International Conference on Principles of Distributed Systems (OPODIS 2018), volume 125 of Leibniz International Proceedings in Informatics (LIPIcs), pages 11:1-11:16, Dagstuhl, Germany, 2018. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. URL: https://doi.org/10.4230/LIPIcs.OPODIS.2018.11.
  7. Guy E Blelloch, Daniel Ferizovic, and Yihan Sun. Just join for parallel ordered sets. In Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, pages 253-264. ACM, 2016. Google Scholar
  8. Guy E Blelloch, Jeremy T Fineman, Yan Gu, and Yihan Sun. Optimal Parallel Algorithms in the Binary-Forking Model. arXiv preprint, 2019. URL: http://arxiv.org/abs/1903.04650.
  9. Guy E. Blelloch and Margaret Reid-Miller. Pipelining with Futures. In Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures, SPAA '97, pages 249-259, New York, NY, USA, 1997. ACM. URL: https://doi.org/10.1145/258492.258517.
  10. Guy E. Blelloch and Margaret Reid-Miller. Fast Set Operations Using Treaps. In Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures, pages 16-26, 1998. URL: https://doi.org/10.1145/277651.277660.
  11. Trevor Brown, Faith Ellen, and Eric Ruppert. A general technique for non-blocking trees. In ACM SIGPLAN Notices, volume 49, pages 329-342. ACM, 2014. Google Scholar
  12. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. The MIT Press, third edition, 2009. Google Scholar
  13. Cynthia Dwork, Maurice Herlihy, and Orli Waarts. Contention in shared memory algorithms. Journal of the ACM (JACM), 44(6):779-805, 1997. Google Scholar
  14. Faith Ellen, Panagiota Fatourou, Joanna Helga, and Eric Ruppert. The amortized complexity of non-blocking binary search trees. In Proceedings of the 2014 ACM symposium on Principles of distributed computing, pages 332-340. ACM, 2014. Google Scholar
  15. Faith Ellen, Panagiota Fatourou, Eric Ruppert, and Franck van Breugel. Non-blocking Binary Search Trees. In Proceedings of the 29th ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, PODC '10, pages 131-140, New York, NY, USA, 2010. ACM. URL: https://doi.org/10.1145/1835698.1835736.
  16. Stephan Erb, Moritz Kobitzsch, and Peter Sanders. Parallel bi-objective shortest paths using weight-balanced b-trees with bulk updates. In International Symposium on Experimental Algorithms, pages 111-122. Springer, 2014. Google Scholar
  17. Panagiota Fatourou and Nikolaos D. Kallimanis. Revisiting the combining synchronization technique. In PPoPP, pages 257-266, 2012. URL: https://doi.org/10.1145/2145816.2145849.
  18. Matteo Frigo, Charles E. Leiserson, and Keith H. Randall. The Implementation of the Cilk-5 Multithreaded Language. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 212-223, 1998. Google Scholar
  19. Michael T Goodrich and S Rao Kosaraju. Sorting on a parallel pointer machine with applications to set expression evaluation. Journal of the ACM (JACM), 43(2):331-361, 1996. Google Scholar
  20. Leo J Guibas, Edward M McCreight, Michael F Plass, and Janet R Roberts. A new representation for linear lists. In Proceedings of the ninth annual ACM symposium on Theory of computing, pages 49-60. ACM, 1977. Google Scholar
  21. Danny Hendler, Itai Incze, Nir Shavit, and Moran Tzafrir. Flat combining and the synchronization-parallelism tradeoff. In Proceedings of the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 355-364, 2010. URL: https://doi.org/10.1145/1810479.1810540.
  22. John Iacono. Alternatives to splay trees with O(log n) worst-case access times. In Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms, pages 516-522. Society for Industrial and Applied Mathematics, 2001. Google Scholar
  23. Intel Corporation. Intel Cilk Plus Language Extension Specification, Version 1.1, 2013. Document 324396-002US. Available from URL: http://cilkplus.org/sites/default/files/open_specifications/Intel_Cilk_plus_lang_spec_2.htm.
  24. Wei Quan Lim. Optimal Multithreaded Batch-Parallel 2-3 Trees. arXiv, 2019. URL: http://arxiv.org/abs/1905.05254.
  25. OpenMP Architecture Review Board. OpenMP application program interface, version 4.0. Available from http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf, July 2013.
  26. Y. Oyama, K. Taura, and A. Yonezawa. Executing Parallel Programs With Synchronization Bottlenecks Efficiently. In Proceedings of the International Workshop on Parallel and Distributed Computing for Symbolic and Irregular Applications (PDSIA), pages 182-204, 1999. Google Scholar
  27. Wolfgang Paul, Uzi Vishkin, and Hubert Wagener. Parallel dictionaries on 2-3 trees. Automata, Languages and Programming, pages 597-609, 1983. Google Scholar
  28. James Reinders. Intel Threading Building Blocks: Outfitting C++ for Multi-Core Processor Parallelism. O'Reilly, 2007. Google Scholar
  29. Daniel Dominic Sleator and Robert Endre Tarjan. Self-adjusting binary search trees. Journal of the ACM (JACM), 32(3):652-686, 1985. Google Scholar
  30. The Task Parallel Library. http://msdn.microsoft.com/en-us/magazine/cc163340.aspx, October 2007.
  31. Thomas Tseng, Laxman Dhulipala, and Guy Blelloch. Batch-Parallel Euler Tour Trees. In 2019 Proceedings of the Twenty-First Workshop on Algorithm Engineering and Experiments (ALENEX), pages 92-106. SIAM, 2019. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail