Funnelselect: Cache-Oblivious Multiple Selection

Authors: Gerth Stølting Brodal, Sebastian Wild




File

LIPIcs.ESA.2023.25.pdf
  • Filesize: 0.88 MB
  • 17 pages


Author Details

Gerth Stølting Brodal
  • Aarhus University, Denmark
Sebastian Wild
  • University of Liverpool, UK

Cite As

Gerth Stølting Brodal and Sebastian Wild. Funnelselect: Cache-Oblivious Multiple Selection. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 25:1-25:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.ESA.2023.25

Abstract

We present the algorithm funnelselect, the first optimal randomized cache-oblivious algorithm for the multiple-selection problem. The algorithm takes as input an unsorted array of N elements and q query ranks r_1 < ⋯ < r_q, and returns in sorted order the q input elements of rank r_1, …, r_q, respectively. In expectation and with high probability, the algorithm uses O(∑_{i = 1}^{q+1} (Δ_i/B) ⋅ log_{M/B}(N/Δ_i) + N/B) I/Os, where B is the external memory block size, M ≥ B^{1+ε} is the internal memory size for some constant ε > 0, and Δ_i = r_i - r_{i-1} (assuming r_0 = 0 and r_{q+1} = N + 1). This is the best possible I/O bound in the cache-oblivious and external memory models. The result is achieved by reversing the computation of the cache-oblivious sorting algorithm funnelsort by Frigo, Leiserson, Prokop and Ramachandran [FOCS 1999], using randomly selected pivots for distributing elements, and pruning computations that with high probability are not expected to contain any query ranks.
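To make the problem statement concrete, the following is a minimal in-memory sketch of multiple selection using random pivots, in the spirit of multiple quickselect. It is not funnelselect: it is neither cache-oblivious nor I/O-optimal, and it only illustrates the input/output contract and the idea of pruning partitions that contain no query rank. The names `multiselect`, `_select` and `gaps` are hypothetical and chosen for this illustration; ranks are assumed to be 1-based, strictly increasing and at most N.

```python
import random
from typing import List, Sequence


def gaps(n: int, ranks: Sequence[int]) -> List[int]:
    """Return Δ_1, ..., Δ_{q+1} with Δ_i = r_i - r_{i-1}, r_0 = 0, r_{q+1} = n + 1."""
    bounds = [0, *ranks, n + 1]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def _select(items: Sequence[int], ranks: Sequence[int]) -> List[int]:
    # Prune: a part that contains no query rank is never partitioned further.
    if not ranks:
        return []
    pivot = random.choice(items)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    lo = len(smaller)           # ranks 1..lo fall into `smaller`
    hi = lo + len(equal)        # ranks lo+1..hi are copies of the pivot
    left = [r for r in ranks if r <= lo]
    mid = [r for r in ranks if lo < r <= hi]
    right = [r - hi for r in ranks if r > hi]
    return _select(smaller, left) + [pivot] * len(mid) + _select(larger, right)


def multiselect(items: Sequence[int], ranks: Sequence[int]) -> List[int]:
    """Return, in sorted order, the elements of `items` with the given 1-based ranks."""
    if any(a >= b for a, b in zip(ranks, ranks[1:])):
        raise ValueError("ranks must be strictly increasing")
    if ranks and (ranks[0] < 1 or ranks[-1] > len(items)):
        raise ValueError("ranks must lie in 1..N")
    return _select(items, ranks)


if __name__ == "__main__":
    data = [7, 3, 9, 1, 5, 8, 2, 10, 4, 6]
    print(multiselect(data, [2, 5, 9]))  # [2, 5, 9]
    print(gaps(len(data), [2, 5, 9]))    # [2, 3, 4, 2]
```

The gaps Δ_i always sum to N + 1, which is why the bound above interpolates between a single linear scan (few, widely spaced ranks) and full sorting (q = N).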

Subject Classification

ACM Subject Classification
  • Theory of computation → Design and analysis of algorithms
Keywords
  • Multiple selection
  • cache-oblivious algorithm
  • randomized algorithm
  • entropy bounds


References

  1. Alok Aggarwal and Jeffrey Scott Vitter. The input/output complexity of sorting and related problems. Commun. ACM, 31(9):1116-1127, 1988. URL: https://doi.org/10.1145/48529.48535.
  2. Lars Arge, Mikael B. Knudsen, and Kirsten Larsen. A general lower bound on the I/O-complexity of comparison-based algorithms. In Frank K. H. A. Dehne, Jörg-Rüdiger Sack, Nicola Santoro, and Sue Whitesides, editors, Algorithms and Data Structures, Third Workshop, WADS '93, Montréal, Canada, August 11-13, 1993, Proceedings, volume 709 of Lecture Notes in Computer Science, pages 83-94. Springer, 1993. URL: https://doi.org/10.1007/3-540-57155-8_238.
  3. Jérémy Barbay, Ankur Gupta, Srinivasa Rao Satti, and Jon Sorenson. Near-optimal online multiselection in internal and external memory. Journal of Discrete Algorithms, 36:3-17, January 2016. URL: https://doi.org/10.1016/j.jda.2015.11.001.
  4. Manuel Blum, Robert W. Floyd, Vaughan R. Pratt, Ronald L. Rivest, and Robert Endre Tarjan. Time bounds for selection. J. Comput. Syst. Sci., 7(4):448-461, 1973. URL: https://doi.org/10.1016/S0022-0000(73)80033-9.
  5. Gerth Stølting Brodal and Rolf Fagerberg. Cache oblivious distribution sweeping. In Peter Widmayer, Francisco Triguero Ruiz, Rafael Morales Bueno, Matthew Hennessy, Stephan J. Eidenbenz, and Ricardo Conejo, editors, Automata, Languages and Programming, 29th International Colloquium, ICALP 2002, Malaga, Spain, July 8-13, 2002, Proceedings, volume 2380 of Lecture Notes in Computer Science, pages 426-438. Springer, 2002. URL: https://doi.org/10.1007/3-540-45465-9_37.
  6. Gerth Stølting Brodal and Rolf Fagerberg. On the limits of cache-obliviousness. In Lawrence L. Larmore and Michel X. Goemans, editors, Proceedings of the 35th Annual ACM Symposium on Theory of Computing, June 9-11, 2003, San Diego, CA, USA, pages 307-315. ACM, 2003. URL: https://doi.org/10.1145/780542.780589.
  7. J. M. Chambers. Partial sorting [M1] (algorithm 410). Commun. ACM, 14(5):357-358, 1971. URL: https://doi.org/10.1145/362588.362602.
  8. David P. Dobkin and J. Ian Munro. Optimal time minimal space selection algorithms. J. ACM, 28(3):454-461, 1981. URL: https://doi.org/10.1145/322261.322264.
  9. Dorit Dor and Uri Zwick. Selecting the median. SIAM Journal on Computing, 28(5):1722-1758, 1999. URL: https://doi.org/10.1137/s0097539795288611.
  10. Devdatt P. Dubhashi and Alessandro Panconesi. Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, 2009. URL: http://www.cambridge.org/gb/knowledge/isbn/item2327542/.
  11. Robert W. Floyd and Ronald L. Rivest. Expected time bounds for selection. Communications of the ACM, 18(3):165-172, March 1975. URL: https://doi.org/10.1145/360680.360691.
  12. Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. Cache-oblivious algorithms. In 40th Annual Symposium on Foundations of Computer Science, FOCS '99, 17-18 October, 1999, New York, NY, USA, pages 285-298. IEEE Computer Society, 1999. URL: https://doi.org/10.1109/SFFCS.1999.814600.
  13. Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. Cache-oblivious algorithms. ACM Trans. Algorithms, 8(1):4:1-4:22, 2012. URL: https://doi.org/10.1145/2071379.2071383.
  14. C. A. R. Hoare. Algorithm 63: partition. Commun. ACM, 4(7):321, 1961. URL: https://doi.org/10.1145/366622.366642.
  15. C. A. R. Hoare. Algorithm 64: quicksort. Commun. ACM, 4(7):321, 1961. URL: https://doi.org/10.1145/366622.366644.
  16. C. A. R. Hoare. Algorithm 65: find. Commun. ACM, 4(7):321-322, 1961. URL: https://doi.org/10.1145/366622.366647.
  17. Xiaocheng Hu, Yufei Tao, Yi Yang, and Shuigeng Zhou. Finding approximate partitions and splitters in external memory. In Proceedings of the 26th ACM symposium on Parallelism in algorithms and architectures. ACM, June 2014. URL: https://doi.org/10.1145/2612669.2612691.
  18. Kanela Kaligosi, Kurt Mehlhorn, J. Ian Munro, and Peter Sanders. Towards optimal multiple selection. In Luís Caires, Giuseppe F. Italiano, Luís Monteiro, Catuscia Palamidessi, and Moti Yung, editors, Automata, Languages and Programming, 32nd International Colloquium, ICALP 2005, Lisbon, Portugal, July 11-15, 2005, Proceedings, volume 3580 of Lecture Notes in Computer Science, pages 103-114. Springer, 2005. URL: https://doi.org/10.1007/11523468_9.
  19. Helmut Prodinger. Multiple Quickselect - Hoare’s Find algorithm for several elements. Information Processing Letters, 56(3):123-129, November 1995. URL: https://doi.org/10.1016/0020-0190(95)00150-b.
  20. Arnold Schönhage, Mike Paterson, and Nicholas Pippenger. Finding the median. J. Comput. Syst. Sci., 13(2):184-199, 1976. URL: https://doi.org/10.1016/S0022-0000(76)80029-3.