A Parallel Batch-Dynamic Data Structure for the Closest Pair Problem

Authors Yiqiu Wang, Shangdi Yu, Yan Gu, Julian Shun



PDF
Thumbnail PDF

File

LIPIcs.SoCG.2021.60.pdf
  • Filesize: 0.98 MB
  • 16 pages

Document Identifiers

Author Details

Yiqiu Wang
  • Massachusetts Institute of Technology, Cambridge, MA, USA
Shangdi Yu
  • Massachusetts Institute of Technology, Cambridge, MA, USA
Yan Gu
  • University of California, Riverside, CA, USA
Julian Shun
  • Massachusetts Institute of Technology, Cambridge, MA, USA

Acknowledgements

We thank Yihan Sun for help on using the PAM library [Sun and Blelloch, 2019].

Cite As Get BibTex

Yiqiu Wang, Shangdi Yu, Yan Gu, and Julian Shun. A Parallel Batch-Dynamic Data Structure for the Closest Pair Problem. In 37th International Symposium on Computational Geometry (SoCG 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 189, pp. 60:1-60:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021) https://doi.org/10.4230/LIPIcs.SoCG.2021.60

Abstract

We propose a theoretically-efficient and practical parallel batch-dynamic data structure for the closest pair problem. Our solution is based on a serial dynamic closest pair data structure by Golin et al., and supports batches of insertions and deletions in parallel. For a data set of size n, our data structure supports a batch of insertions or deletions of size m in O(m(1+log ((n+m)/m))) expected work and O(log (n+m)log^*(n+m)) depth with high probability, and takes linear space. The key techniques for achieving these bounds are a new work-efficient parallel batch-dynamic binary heap, and careful management of the computation across sets of points to minimize work and depth.
We provide an optimized multicore implementation of our data structure using dynamic hash tables, parallel heaps, and dynamic k-d trees. Our experiments on a variety of synthetic and real-world data sets show that it achieves a parallel speedup of up to 38.57x (15.10x on average) on 48 cores with hyper-threading. In addition, we also implement and compare four parallel algorithms for static closest pair problem, for which we are not aware of any existing practical implementations. On 48 cores with hyper-threading, the static algorithms achieve up to 51.45x (29.42x on average) speedup, and Rabin’s algorithm performs the best on average. Comparing our dynamic algorithm to the fastest static algorithm, we find that it is advantageous to use the dynamic algorithm for batch sizes of up to 20% of the data set. As far as we know, our work is the first to experimentally evaluate parallel closest pair algorithms, in both the static and the dynamic settings.

Subject Classification

ACM Subject Classification
  • Theory of computation → Shared memory algorithms
  • Computing methodologies → Shared memory algorithms
  • Theory of computation → Computational geometry
Keywords
  • Closest Pair
  • Parallel Algorithms
  • Dynamic Algorithms
  • Experimental Algorithms

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Chem dataset. URL: https://archive.ics.uci.edu/ml/datasets/Gas+sensor+array+under+dynamic+gas+mixtures.
  2. Pankaj K. Agarwal, Haim Kaplan, and Micha Sharir. Kinetic and dynamic data structures for closest pair and all nearest neighbors. ACM Transactions on Algorithms (TALG), 5(1):1-37, 2008. Google Scholar
  3. Suguru Arimoto and Hiroshi Noborio. A 3d closest pair algorithm and its applications to robot motion planning. IFAC Proceedings Volumes, 21(16):471-480, 1988. Google Scholar
  4. Mikhail J. Atallah and Michael T. Goodrich. Efficient parallel solutions to some geometric problems. J. Parallel Distrib. Comput., 3(4):492-507, 1986. Google Scholar
  5. Bahareh Banyassady and Wolfgang Mulzer. A simple analysis of Rabin’s algorithm for finding closest pairs. European Workshop on Computational Geometry (EuroCG), 2007. Google Scholar
  6. Jon L. Bentley. Multidimensional binary search trees used for associative searching. Commun. ACM, 18(9):509-517, 1975. Google Scholar
  7. Jon L. Bentley. Multidimensional divide-and-conquer. Commun. ACM, 23(4):214-229, 1980. Google Scholar
  8. Jon L. Bentley and Michael I. Shamos. Divide-and-conquer in multidimensional space. In ACM Symposium on Theory of Computing (STOC), pages 220-230, 1976. Google Scholar
  9. Sergei N. Bespamyatnikh. An optimal algorithm for closest-pair maintenance. Discrete & Computational Geometry, 19(2):175-195, 1998. Google Scholar
  10. Guy E. Blelloch, Yan Gu, Julian Shun, and Yihan Sun. Parallelism in randomized incremental algorithms. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), page 467–478, 2016. Google Scholar
  11. Guy E. Blelloch and Bruce M. Maggs. Parallel algorithms. In Algorithms and Theory of Computation Handbook: Special Topics and Techniques, pages 25-25. Chapman & Hall/CRC, 2010. Google Scholar
  12. Richard P. Brent. The parallel evaluation of general arithmetic expressions. J. ACM, 21(2):201-206, 1974. Google Scholar
  13. Paul B. Callahan and S. Rao Kosaraju. Algorithms for dynamic closest pair and n-body potential fields. In ACM-SIAM Symposium on Discrete Algorithms (SODA), page 263–272, 1995. Google Scholar
  14. Timothy M. Chan. Geometric applications of a randomized optimization technique. Discrete & Computational Geometry, 22(4):547-567, 1999. Google Scholar
  15. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms (3rd edition). MIT Press, 2009. Google Scholar
  16. Martin Dietzfelbinger, Torben Hagerup, Jyrki Katajainen, and Martti Penttonen. A reliable randomized algorithm for the closest-pair problem. J. Algorithms, 25(1):19-51, 1997. Google Scholar
  17. Dheeru Dua and Casey Graff. UCI machine learning repository, 2017. URL: http://archive.ics.uci.edu/ml.
  18. David Eppstein. Fast hierarchical clustering and other applications of dynamic closest pairs. J. Experimental Algorithmics, 5:1-es, 2000. Google Scholar
  19. Jordi Fonollosa, Sadique Sheik, Ramón Huerta, and Santiago Marco. Reservoir computing compensates slow response of chemosensor arrays exposed to fast varying gas concentrations in continuous monitoring. Sensors and Actuators B: Chemical, 215:618-629, 2015. Google Scholar
  20. Steve Fortune and John Hopcroft. A note on rabin’s nearest-neighbor algorithm. Information Processing Letters, 8(1):20-23, 1979. Google Scholar
  21. Junhao Gan and Yufei Tao. On the hardness and approximation of euclidean DBSCAN. ACM Transactions Database Systems, 42(3):14:1-14:45, 2017. Google Scholar
  22. Joseph Gil, Yossi Matias, and Uzi Vishkin. Towards a theory of nearly constant time parallel algorithms. In IEEE Symposium on Foundations of Computer Science (FOCS), pages 698-710, 1991. Google Scholar
  23. Mordecai Golin, Rajeev Raman, Christian Schwarz, and Michiel Smid. Simple randomized algorithms for closest pair problems. Nordic J. of Computing, 2(1):3–27, 1995. Google Scholar
  24. Mordecai Golin, Rajeev Raman, Christian Schwarz, and Michiel Smid. Randomized data structures for the dynamic closest-pair problem. SIAM J. Scientific Computing, 27(4):1036-1072, 1998. Google Scholar
  25. Klaus Hinrichs, Jurg Nievergelt, and Peter Schorn. Plane-sweep solves the closest pair problem elegantly. Information Processing Letters, 26(5):255-261, 1988. Google Scholar
  26. Joseph JaJa. Introduction to Parallel Algorithms. Addison-Wesley Professional, 1992. Google Scholar
  27. Sanjiv Kapoor and Michiel Smid. New techniques for exact and approximate dynamic closest-point problems. SIAM J. Scientific Computing, 25(4):775-796, 1996. Google Scholar
  28. Samir Khuller and Yossi Matias. A simple randomized sieve algorithm for the closest-pair problem. Information and Computation, 118(1):34–37, April 1995. Google Scholar
  29. YongChul Kwon, Dylan Nunley, Jeffrey P. Gardner, Magdalena Balazinska, Bill Howe, and Sarah Loebman. Scalable clustering algorithm for N-body simulations in a shared-nothing cluster. In Scientific and Statistical Database Management, pages 132-150, 2010. Google Scholar
  30. Md. Nasir Uddin Laskar and TaeChoong Chung. Mobile robot path planning : an efficient distance computation between obstacles using discrete boundary model (dbm), 2012. Google Scholar
  31. Charles E. Leiserson. The Cilk++ concurrency platform. J. Supercomputing, 51(3), 2010. Google Scholar
  32. Hans-Peter Lenhof and Michiel Smid. Sequential and parallel algorithms for the k closest pairs problem. International J. of Computational Geometry & Applications, 5(03):273-288, 1995. Google Scholar
  33. Philip D. MacKenzie and Quentin F. Stout. Ultrafast expected time parallel algorithms. J. Algorithms, 26(1):1-33, 1998. Google Scholar
  34. Mark H. Overmars. Dynamization of order decomposable set problems. J. Algorithms, 2(3):245-260, 1981. Google Scholar
  35. Maria Cristina Pinotti and Geppino Pucci. Parallel algorithms for priority queue operations. Theoretical Computer Science (TCS), 148(1):171-180, 1995. Google Scholar
  36. Michael O. Rabin. Probabilistic algorithms, 1976. Google Scholar
  37. Sanguthevar Rajasekaran and Sudipta Pathak. Efficient algorithms for the closest pair problem and applications. arXiv preprint, 2014. URL: http://arxiv.org/abs/1407.5609.
  38. Sanguthevar Rajasekaran and John H. Reif. Optimal and sublogarithmic time randomized parallel sorting algorithms. SIAM J. Scientific Computing, 18(3):594-607, 1989. Google Scholar
  39. Christian Schwarz and Michiel Smid. An O(n log n log log n) algorithm for the on-line closest pair problem. In ACM-SIAM Symposium on Discrete Algorithms (SODA), SODA '92, page 280–285, USA, 1992. Society for Industrial and Applied Mathematics. Google Scholar
  40. Christian Schwarz, Michiel Smid, and Jack Snoeyink. An optimal algorithm for the on-line closest-pair problem. Algorithmica, 12(1):18-29, 1994. Google Scholar
  41. Michael I. Shamos and Dan Hoey. Closest-point problems. In IEEE Symposium on Foundations of Computer Science (FOCS), pages 151-162, 1975. Google Scholar
  42. Michiel Smid. Maintaining the minimal distance of a point set in less than linear time. In Algorithms Rev., pages 33-44, 1991. Google Scholar
  43. Michiel Smid. Maintaining the minimal distance of a point set in polylogarithmic time. In ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1-6, 1991. Google Scholar
  44. Michiel Smid. Closest-point problems in computational geometry. In Handbook of Computational Geometry, pages 877-935. North Holland / Elsevier, 2000. Google Scholar
  45. Yihan Sun and Guy E. Blelloch. Parallel range, segment and rectangle queries with augmented maps. In Algorithm Engineering and Experiments (ALENEX), pages 159-173, 2019. Google Scholar
  46. Kenneth J. Supowit. New techniques for some dynamic closest-point and farthest-point problems. In ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 84-90, 1990. Google Scholar
  47. Uzi Vishkin. Thinking in parallel: Some basic data-parallel algorithms. University of Maryland, 2010. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail