Scalable Distributed String Sorting

Authors: Florian Kurpicz, Pascal Mehnert, Peter Sanders, Matthias Schimek

Author Details

Florian Kurpicz
  • Karlsruhe Institute of Technology, Germany
Pascal Mehnert
  • Independent, Germany
Peter Sanders
  • Karlsruhe Institute of Technology, Germany
Matthias Schimek
  • Karlsruhe Institute of Technology, Germany

Florian Kurpicz, Pascal Mehnert, Peter Sanders, and Matthias Schimek. Scalable Distributed String Sorting. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 83:1-83:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


String sorting is an important part of tasks such as building index data structures. Unfortunately, current string sorting algorithms do not scale to massively parallel distributed-memory machines since they either have latency (at least) proportional to the number of processors p or communicate the data a large number of times (at least logarithmic). We present practical and efficient algorithms for distributed-memory string sorting that scale to large p. Similar to state-of-the-art sorters for atomic objects, the algorithms have latency of about p^{1/k} when allowing the data to be communicated k times. Experiments indicate good scaling behavior on a wide range of inputs on up to 49152 cores. Overall, we achieve speedups of up to 4.9 over the current state-of-the-art distributed string sorting algorithms.
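To give a feel for the trade-off stated in the abstract, the following minimal sketch evaluates the term p^{1/k} for the largest core count reported there (p = 49152) and a few small values of k. It is only an illustration of the arithmetic: the function name `latency_term` is hypothetical, and constant factors, the bandwidth cost of moving the data k times, and the paper's actual cost model are all ignored.

```python
def latency_term(p: int, k: int) -> float:
    """Return p**(1/k), the approximate latency growth term when the
    data may be communicated k times (constants are ignored)."""
    if p < 1 or k < 1:
        raise ValueError("p and k must be positive integers")
    return p ** (1.0 / k)


if __name__ == "__main__":
    p = 49152  # largest core count mentioned in the abstract
    for k in (1, 2, 3):
        # Larger k means more communication rounds but a smaller latency term.
        print(f"k={k}: p^(1/k) is about {latency_term(p, k):.1f}")
```

Under these assumptions, a single round (k = 1) gives a term equal to p itself, while k = 2 brings it down to roughly 222 and k = 3 to roughly 37, which is why a small number of extra rounds can pay off on very large machines.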

Subject Classification

ACM Subject Classification
  • Theory of computation → Sorting and searching
  • Theory of computation → Massively parallel algorithms
  • Computing methodologies → Distributed algorithms
  • Theory of computation → Bloom filters and hashing

Keywords
  • sorting
  • strings
  • distributed-memory computing
  • distributed membership filters
  • scalability


