Omar Shaaban, Juliette Fournis d'Albiat, Isabel Piedrahita
Creative Commons Attribution 4.0 International license
This talk will present recent advances in extending OmpSs-2 to distributed-memory systems, highlighting three contributions and the associated challenges. First, OmpSs-2@Cluster employs a common address space and weak accesses to support concurrent task creation and dataflow execution across nodes. Achieving good performance and scalability on 16 to 32 nodes requires detailed performance analysis together with a set of optimizations and runtime techniques, which I will outline in the talk. Second, I will describe how task offloading, in combination with BSC’s Dynamic Load Balancing (DLB), enables OmpSs-2@Cluster to mitigate load imbalance in MPI+OmpSs-2 programs with minimal application changes. Third, I will explain how the runtime can exploit the iterative structure of certain task dependency graphs to precompute communications and execute iterative regions efficiently, yielding performance and scalability comparable to state-of-the-art asynchronous MPI+X. Together, these results indicate that distributed tasking can combine productivity, adaptability, and high performance in modern HPC applications.
@InProceedings{carpenter_et_al:OASIcs.PARMA-DITAM.2026.1,
author = {Carpenter, Paul and Shaaban, Omar and d'Albiat, Juliette Fournis and Piedrahita, Isabel},
title = {{Distributed Task Execution: Opportunities, Challenges and Lessons Learnt from OmpSs-2@Cluster}},
booktitle = {17th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 15th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2026)},
pages = {1:1--1:7},
series = {Open Access Series in Informatics (OASIcs)},
ISBN = {978-3-95977-416-1},
ISSN = {2190-6807},
year = {2026},
volume = {141},
editor = {Baroffio, Davide and Busia, Paola and Denisov, Lev and Shukla, Nitin},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2026.1},
URN = {urn:nbn:de:0030-drops-256685},
doi = {10.4230/OASIcs.PARMA-DITAM.2026.1},
annote = {Keywords: Task-based programming, distributed-memory clusters, programming models, runtime systems, task scheduling, data dependency management, load balancing, asynchronous communication}
}