On Static Timing Analysis of GPU Kernels

Hirvisalo, Vesa

doi:10.4230/OASIcs.WCET.2014.43

File

OASIcs.WCET.2014.43.pdf

Filesize: 421 kB
10 pages

Document Identifiers

DOI: 10.4230/OASIcs.WCET.2014.43
URN: urn:nbn:de:0030-drops-46033

Author Details

Vesa Hirvisalo

Cite As

Vesa Hirvisalo. On Static Timing Analysis of GPU Kernels. In 14th International Workshop on Worst-Case Execution Time Analysis. Open Access Series in Informatics (OASIcs), Volume 39, pp. 43-52, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)
https://doi.org/10.4230/OASIcs.WCET.2014.43

Abstract

We study static timing analysis of programs running on GPU accelerators. Such programs follow a data-parallel programming model that allows massive parallelism on manycore processors. Data-parallel programming and GPUs as accelerators have come into wide use in recent years. Timing analysis of programs running on single-core machines is well understood and is also applied in practice. For multicore and manycore machines, however, timing analysis remains a significant problem that has not yet been properly solved. In this paper, we present static timing analysis of GPU kernels based on a method that we call abstract CTA simulation. Cooperative Thread Arrays (CTAs) are the basic execution structure of GPU devices, whose operation proceeds in thread groups called warps. Abstract CTA simulation is based on static analysis of thread divergence within warps and on abstract scheduling of the warps.
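To illustrate why thread divergence matters for a timing bound, the following is a minimal sketch and not the method of the paper. It assumes a warp size of 32, invented per-path cycle costs, a divergence flag supplied by some analysis that is not shown, and a deliberately pessimistic scheduling model in which the warps of a CTA are issued strictly one after another. All names (`Branch`, `warp_branch_bound`, `cta_bound`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence

WARP_SIZE = 32  # assumption: warps of 32 threads


@dataclass(frozen=True)
class Branch:
    """An if/else region with hypothetical per-path cycle costs."""
    then_cost: int
    else_cost: int
    may_diverge: bool  # assumed output of a divergence analysis (not shown)


def warp_branch_bound(b: Branch) -> int:
    """Upper bound on the cycles one warp spends in the region.

    If threads of the warp may take different paths, the warp executes
    both paths one after the other, so the costs add up. If the branch
    is uniform, only one path runs and the bound is the costlier path.
    """
    if b.may_diverge:
        return b.then_cost + b.else_cost
    return max(b.then_cost, b.else_cost)


def cta_bound(num_threads: int, regions: Sequence[Branch]) -> int:
    """Coarse bound for a CTA under the assumed fully serialized warp model."""
    num_warps = -(-num_threads // WARP_SIZE)  # ceiling division
    per_warp = sum(warp_branch_bound(b) for b in regions)
    return num_warps * per_warp
```

For example, a CTA of 100 threads occupies 4 warps under this model, so `cta_bound(100, [Branch(10, 4, True), Branch(3, 7, False)])` evaluates to 4 * (14 + 7) = 84 cycles. A realistic analysis would replace the serialized model with an abstraction of the actual warp scheduler and memory latencies.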

Keywords

Parallelism
WCET

