HPC Application Cloudification: The StreamFlow Toolkit (Invited Paper)

Authors Iacopo Colonnelli , Barbara Cantalupo , Roberto Esposito , Matteo Pennisi , Concetto Spampinato , Marco Aldinucci




File

OASIcs.PARMA-DITAM.2021.5.pdf
  • Filesize: 3.72 MB
  • 13 pages

Author Details

Iacopo Colonnelli
  • Computer Science Department, University of Torino, Italy
Barbara Cantalupo
  • Computer Science Department, University of Torino, Italy
Roberto Esposito
  • Computer Science Department, University of Torino, Italy
Matteo Pennisi
  • Electrical Engineering Department, University of Catania, Italy
Concetto Spampinato
  • Electrical Engineering Department, University of Catania, Italy
Marco Aldinucci
  • Computer Science Department, University of Torino, Italy

Acknowledgements

We thank Emanuela Girardi and Gianluca Bontempi, who are coordinating the CLAIRE task force on COVID-19, for their support. We also thank the group of volunteer researchers who contributed to the development of the CLAIRE COVID-19 universal pipeline: Marco Calandri and Piero Fariselli (Radiomics & medical science, University of Torino, Italy); Marco Grangetto, Enzo Tartaglione (Digital image processing Lab, University of Torino, Italy); Simone Palazzo, Isaak Kavasidis (PeRCeiVe Lab, University of Catania, Italy); Bogdan Ionescu, Gabriel Constantin (Multimedia Lab @ CAMPUS Research Institute, University Politechnica of Bucharest, Romania); Miquel Perello Nieto (Computer Science, University of Bristol, UK); Inês Domingues (School of Sciences University of Porto, Portugal).

Cite As

Iacopo Colonnelli, Barbara Cantalupo, Roberto Esposito, Matteo Pennisi, Concetto Spampinato, and Marco Aldinucci. HPC Application Cloudification: The StreamFlow Toolkit (Invited Paper). In 12th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and 10th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2021). Open Access Series in Informatics (OASIcs), Volume 88, pp. 5:1-5:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/OASIcs.PARMA-DITAM.2021.5

Abstract

Finding an effective way to improve accessibility to High-Performance Computing facilities, still anchored to SSH-based remote shells and queue-based job submission mechanisms, is an open problem in computer science. This work advocates a cloudification of HPC applications through a cluster-as-accelerator pattern, where computationally demanding portions of the main execution flow hosted on a Cloud infrastructure can be offloaded to HPC environments to speed them up. We introduce StreamFlow, a novel Workflow Management System that supports such a design pattern and makes it possible to run the steps of a standard workflow model on independent processing elements with no shared storage. We validated the proposed approach’s effectiveness on the CLAIRE COVID-19 universal pipeline, i.e., a reproducible workflow capable of automating the comparison of (possibly all) state-of-the-art pipelines for the diagnosis of COVID-19 interstitial pneumonia from CT scan images based on Deep Neural Networks (DNNs).

Subject Classification

ACM Subject Classification
  • Computer systems organization → Cloud computing
  • Computing methodologies → Distributed computing methodologies
Keywords
  • cloud computing
  • distributed computing
  • high-performance computing
  • streamflow
  • workflow management systems
