BlockQuicksort: Avoiding Branch Mispredictions in Quicksort

Authors Stefan Edelkamp, Armin Weiss



PDF
Thumbnail PDF

File

LIPIcs.ESA.2016.38.pdf
  • Filesize: 0.6 MB
  • 16 pages

Document Identifiers

Author Details

Stefan Edelkamp
Armin Weiss

Cite As Get BibTex

Stefan Edelkamp and Armin Weiss. BlockQuicksort: Avoiding Branch Mispredictions in Quicksort. In 24th Annual European Symposium on Algorithms (ESA 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 57, pp. 38:1-38:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016) https://doi.org/10.4230/LIPIcs.ESA.2016.38

Abstract

Since the work of Kaligosi and Sanders (2006), it is well-known that Quicksort - which is commonly considered as one of the fastest in-place sorting algorithms - suffers in an essential way from branch mispredictions.  We present a novel approach to address this problem by partially decoupling control from data flow: in order to perform the partitioning, we split the input in blocks of constant size (we propose 128 data elements); then, all elements in one block are compared with the pivot and the outcomes of the comparisons are stored in a buffer. In a second pass, the respective elements are rearranged. By doing so, we avoid conditional branches based on outcomes of comparisons at all (except for the final Insertionsort). Moreover, we prove that for a static branch predictor the average total number of branch mispredictions is at most epsilon n log n + O(n) for some small epsilon depending on the block size when sorting n elements.

Our experimental results are promising: when sorting random integer data, we achieve an increase in speed (number of elements sorted per second) of more than 80% over the GCC implementation of C++ std::sort. Also for many other types of data and non-random inputs, there is still a significant speedup over std::sort. Only in few special cases like sorted or almost sorted inputs, std::sort can beat our implementation. Moreover, even on random input permutations, our implementation is even slightly faster than an implementation of the highly tuned Super Scalar Sample Sort, which uses a linear amount of additional space.

Subject Classification

Keywords
  • in-place sorting
  • Quicksort
  • branch mispredictions
  • lean programs

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. D. Abhyankar and M. Ingle. Engineering of a quicksort partitioning algorithm. Journal of Global Research in Computer Science, 2(2):17-23, 2011. Google Scholar
  2. ARMv8 Instruction Set Overview, 2011. Document number: PRD03-GENC-010197 15.0. Google Scholar
  3. Martin Aumüller and Martin Dietzfelbinger. Optimal partitioning for dual pivot quicksort - (extended abstract). In ICALP, pages 33-44, 2013. Google Scholar
  4. Martin Aumüller, Martin Dietzfelbinger, and Pascal Klaue. How good is multi-pivot quicksort? CoRR, abs/1510.04676, 2015. Google Scholar
  5. Paul Biggar, Nicholas Nash, Kevin Williams, and David Gregg. An experimental study of sorting and branch prediction. J. Exp. Algorithmics, 12:1.8:1-39, 2008. Google Scholar
  6. Gerth Stølting Brodal, Rolf Fagerberg, and Kristoffer Vinther. Engineering a cache-oblivious sorting algorithm. J. Exp. Algorithmics, 12:2.2:1-23, 2008. Google Scholar
  7. Gerth Stølting Brodal and Gabriel Moruz. Tradeoffs between branch mispredictions and comparisons for sorting algorithms. In WADS, volume 3608 of LNCS, pages 385-395. Springer, 2005. Google Scholar
  8. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. The MIT Press, 3nd edition, 2009. Google Scholar
  9. Stefan Edelkamp and Armin Weiß. Blockquicksort: How branch mispredictions don't affect quicksort. CoRR, abs/1604.06697, 2016. Google Scholar
  10. Amr Elmasry and Jyrki Katajainen. Lean programs, branch mispredictions, and sorting. In FUN, volume 7288 of LNCS, pages 119-130. Springer, 2012. Google Scholar
  11. Amr Elmasry, Jyrki Katajainen, and Max Stenmark. Branch mispredictions don't affect mergesort. In SEA, pages 160-171, 2012. Google Scholar
  12. Robert W. Floyd. Algorithm 245: Treesort 3. Comm. of the ACM, 7(12):701, 1964. Google Scholar
  13. John L. Hennessy and David A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann, 5th edition, 2011. Google Scholar
  14. Charles A. R. Hoare. Quicksort. The Computer Journal, 5(1):10-16, 1962. Google Scholar
  15. Intel 64 and IA-32 Architecture Optimization Reference Manual, 2016. Order Number: 248966-032. Google Scholar
  16. Kanela Kaligosi and Peter Sanders. How branch mispredictions affect quicksort. In ESA, pages 780-791, 2006. Google Scholar
  17. Jyrki Katajainen. Sorting programs executing fewer branches. CPH STL Report 2263887503, Department of Computer Science, University of Copenhagen, 2014. Google Scholar
  18. Donald E. Knuth. Sorting and Searching, volume 3 of The Art of Computer Programming. Addison Wesley Longman, 2nd edition, 1998. Google Scholar
  19. Shrinu Kushagra, Alejandro López-Ortiz, Aurick Qiao, and J. Ian Munro. Multi-pivot quicksort: Theory and experiments. In ALENEX, pages 47-60, 2014. Google Scholar
  20. Anthony LaMarca and Richard E Ladner. The influence of caches on the performance of sorting. J. Algorithms, 31(1):66-104, 1999. Google Scholar
  21. Conrado Martínez, Markus E. Nebel, and Sebastian Wild. Analysis of branch misses in quicksort. In Workshop on Analytic Algorithmics and Combinatorics, ANALCO 2015, San Diego, CA, USA, January 4, 2015, pages 114-128, 2015. Google Scholar
  22. Conrado Martínez and Salvador Roura. Optimal Sampling Strategies in Quicksort and Quickselect. SIAM J. Comput., 31(3):683-705, 2001. URL: http://dx.doi.org/10.1137/S0097539700382108.
  23. David R. Musser. Introspective sorting and selection algorithms. Software - Practice and Experience, 27(8):983-993, 1997. Google Scholar
  24. Charles Price. MIPS IV Instruction Set, 1995. Google Scholar
  25. Peter Sanders and Sebastian Winkel. Super Scalar Sample Sort. In ESA, pages 784-796, 2004. Google Scholar
  26. Robert Sedgewick. The analysis of quicksort programs. Acta Inf., 7(4):327-355, 1977. Google Scholar
  27. Robert Sedgewick. Implementing quicksort programs. Commun. ACM, 21(10):847-857, 1978. Google Scholar
  28. Sebastian Wild and Markus E. Nebel. Average case analysis of java 7’s dual pivot quicksort. In ESA, pages 825-836, 2012. Google Scholar
  29. Sebastian Wild, Markus E. Nebel, and Ralph Neininger. Average case and distributional analysis of dual-pivot quicksort. ACM Transactions on Algorithms, 11(3):22:1-42, 2015. Google Scholar
  30. J. W. J. Williams. Algorithm 232: Heapsort. Communications of the ACM, 7(6):347-348, 1964. Google Scholar
  31. Vladimir Yaroslavskiy. Dual-Pivot Quicksort algorithm, 2009. URL: http://codeblab.com/wp-content/uploads/2009/09/DualPivotQuicksort.pdf.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail