Employing MPI Collectives for Timing Analysis on Embedded Multi-Cores

Authors Martin Frieb, Alexander Stegmeier, Jörg Mische, Theo Ungerer




Cite As

Martin Frieb, Alexander Stegmeier, Jörg Mische, and Theo Ungerer. Employing MPI Collectives for Timing Analysis on Embedded Multi-Cores. In 16th International Workshop on Worst-Case Execution Time Analysis (WCET 2016). Open Access Series in Informatics (OASIcs), Volume 55, pp. 10:1-10:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)
https://doi.org/10.4230/OASIcs.WCET.2016.10

Abstract

Static worst-case execution time (WCET) analysis of parallel programs running on shared-memory multi-cores suffers from high pessimism. Distributed-memory platforms, whose cores communicate via messages, may be an alternative for many-core systems. The Message Passing Interface (MPI) is a standard for communication on such platforms. We show how its concept of collective operations can be employed for timing analysis. The idea is that the WCET of a parallel program can be estimated by adding the WCET estimates of its sequential parts to the WCET estimates of its communication parts. To this end, we first analyse the two MPI operations MPI_Allreduce and MPI_Sendrecv. Using these results, we then perform a timing analysis of the conjugate gradient (CG) benchmark from the NAS parallel benchmark suite.
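
The additive idea in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the phase names, the cycle counts, and the repetition counts below are hypothetical values chosen only for illustration, and the sketch assumes that every phase already has a safe upper bound in cycles.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Phase:
    """One program phase with an assumed WCET bound (hypothetical model)."""
    name: str
    kind: str            # "sequential" or "communication"
    wcet_cycles: int     # assumed upper bound for one execution, in cycles
    repetitions: int = 1  # assumed upper bound on how often the phase runs


def program_wcet(phases: List[Phase]) -> int:
    """Add the bounds of all sequential and communication phases."""
    return sum(p.wcet_cycles * p.repetitions for p in phases)


def wcet_by_kind(phases: List[Phase]) -> Dict[str, int]:
    """Split the total into its sequential and communication shares."""
    totals: Dict[str, int] = {"sequential": 0, "communication": 0}
    for p in phases:
        totals[p.kind] += p.wcet_cycles * p.repetitions
    return totals


# Hypothetical numbers, not measurements or results from the paper.
phases = [
    Phase("local computation", "sequential", 12_000, repetitions=25),
    Phase("MPI_Allreduce", "communication", 3_500, repetitions=25),
    Phase("MPI_Sendrecv", "communication", 1_200, repetitions=50),
]

total = program_wcet(phases)
shares = wcet_by_kind(phases)
print(total, shares)
```

In this sketch the overall estimate is simply the sum of the per-phase bounds, so the quality of the result depends entirely on how tight the individual sequential and communication bounds are.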

Keywords

  • Real Time
  • Network on Chip
  • WCET
  • Timing Analysis
  • MPI
