
High-Quality Hierarchical Process Mapping

Authors: Marcelo Fonseca Faraj, Alexander van der Grinten, Henning Meyerhenke, Jesper Larsson Träff, Christian Schulz



Author Details

Marcelo Fonseca Faraj
  • Faculty of Computer Science, University of Vienna, Austria
Alexander van der Grinten
  • Humboldt-Universität zu Berlin, Germany
Henning Meyerhenke
  • Humboldt-Universität zu Berlin, Germany
Jesper Larsson Träff
  • Faculty of Informatics, TU Wien, Vienna, Austria
Christian Schulz
  • Faculty of Computer Science, University of Vienna, Austria

Cite As

Marcelo Fonseca Faraj, Alexander van der Grinten, Henning Meyerhenke, Jesper Larsson Träff, and Christian Schulz. High-Quality Hierarchical Process Mapping. In 18th International Symposium on Experimental Algorithms (SEA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 160, pp. 4:1-4:15, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)


Partitioning graphs into blocks of roughly equal size such that few edges run between blocks is a frequently needed operation when processing graphs on a parallel computer. When the topology of the distributed system is known, an important task is then to map the blocks of the partition onto the processors such that the overall communication cost is reduced. We present novel multilevel algorithms that integrate graph partitioning and process mapping. Important ingredients of our algorithms include fast label propagation, more localized local search, initial partitioning, as well as a compressed data structure to compute processor distances without storing a distance matrix. Moreover, our algorithms are able to exploit a given hierarchical structure of the distributed system under consideration. Experiments indicate that our algorithms speed up the overall mapping process and, due to the integrated multilevel approach, also find much better solutions in practice. For example, for large numbers of partitions, one configuration of our algorithm matches the mapping quality of the previous state of the art while being a factor of 9.3 faster. Compared to Scotch, the currently fastest iterated multilevel mapping algorithm, we obtain 16% better solutions while investing slightly more running time.
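To make the idea of computing processor distances without a distance matrix concrete, the following is a minimal sketch and not the authors' data structure. It assumes a homogeneous hierarchy described by two hypothetical lists: `sizes` (how many units of one level are grouped into a unit of the next level) and `dists` (the cost of communicating across each level). The function names, the edge-dictionary format, and the convention of counting each undirected edge once are likewise assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def hierarchy_distance(p: int, q: int, sizes: List[int], dists: List[int]) -> int:
    """Distance between processing elements p and q, derived from the hierarchy.

    sizes[i] is the number of level-i units per level-(i+1) unit (assumed input),
    dists[i] is the communication cost when p and q first share a unit at level i+1.
    No distance matrix is stored; the answer is computed on the fly.
    """
    if p == q:
        return 0
    group = 1
    for size, dist in zip(sizes, dists):
        group *= size  # number of PEs contained in one unit of this level
        if p // group == q // group:  # p and q lie in the same unit
            return dist
    raise ValueError("PE index outside the described hierarchy")


def mapping_cost(
    edges: Dict[Tuple[int, int], int],
    block_of: List[int],
    pe_of_block: List[int],
    sizes: List[int],
    dists: List[int],
) -> int:
    """Total communication cost: edge weight times distance of the assigned PEs.

    Each undirected edge is counted once (an assumption of this sketch).
    """
    total = 0
    for (u, v), weight in edges.items():
        pu = pe_of_block[block_of[u]]
        pv = pe_of_block[block_of[v]]
        total += weight * hierarchy_distance(pu, pv, sizes, dists)
    return total


if __name__ == "__main__":
    # Hypothetical system: 2 cores per processor, 2 processors per node.
    sizes, dists = [2, 2], [1, 10]
    # Hypothetical path graph with four vertices, one vertex per block.
    edges = {(0, 1): 3, (1, 2): 2, (2, 3): 1}
    block_of = [0, 1, 2, 3]
    print(mapping_cost(edges, block_of, [0, 1, 2, 3], sizes, dists))
    print(mapping_cost(edges, block_of, [0, 2, 3, 1], sizes, dists))
```

In this sketch the memory needed to answer distance queries grows with the number of hierarchy levels rather than quadratically with the number of processing elements, which is the motivation for avoiding an explicit matrix. Comparing the two mappings in the usage example shows how placing heavily communicating blocks on nearby processing elements lowers the cost.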

Subject Classification

ACM Subject Classification
  • Theory of computation → Design and analysis of algorithms
Keywords
  • Process Mapping
  • Graph Partitioning
  • Algorithm Engineering



