Search Results

Documents authored by Jeanmougin, Louison


Document
Bounding the WCET of a GPU Thread Block with a Multi-Phase Representation of Warps Execution

Authors: Louison Jeanmougin, Thomas Carle, and Christine Rochange

Published in: LIPIcs, Volume 335, 37th Euromicro Conference on Real-Time Systems (ECRTS 2025)


Abstract
This paper proposes to model the Worst-Case Execution Time (WCET) of a GPU thread block as the Worst-Case Response Time (WCRT) of the warps composing the block. Inspired by the WCRT analyzes for classical CPU tasks, the response time of a warp is modeled as its execution time in isolation added to an interference term that accounts for the execution of higher priority warps. We provide an algorithm to build a representation of the execution of each warp of a thread block that distinguishes phases of execution on the functional units and phases of idleness due to operations latency. A simple formula relying on this model is then proposed to safely upper bound the WCRT of warps scheduled under greedy policies such as Greedy-Then-Oldest (GTO) or Loose Round-Robin (LRR). We experimented our approach using simulations of kernels from a GPU benchmark suite on the Accel-Sim simulator. We also evaluated the model on a GPU program that is likely to be found in safety critical systems : SGEMM (Single-precision GEneral Matrix Multiplication). This work constitutes a promising first building block of an analysis pipeline for enabling static WCET computation on GPUs.

Cite as

Louison Jeanmougin, Thomas Carle, and Christine Rochange. Bounding the WCET of a GPU Thread Block with a Multi-Phase Representation of Warps Execution. In 37th Euromicro Conference on Real-Time Systems (ECRTS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 335, pp. 11:1-11:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{jeanmougin_et_al:LIPIcs.ECRTS.2025.11,
  author =	{Jeanmougin, Louison and Carle, Thomas and Rochange, Christine},
  title =	{{Bounding the WCET of a GPU Thread Block with a Multi-Phase Representation of Warps Execution}},
  booktitle =	{37th Euromicro Conference on Real-Time Systems (ECRTS 2025)},
  pages =	{11:1--11:26},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-377-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{335},
  editor =	{Mancuso, Renato},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2025.11},
  URN =		{urn:nbn:de:0030-drops-235898},
  doi =		{10.4230/LIPIcs.ECRTS.2025.11},
  annote =	{Keywords: GPU, WCET analysis}
}
Document
Artifact
Bounding the WCET of a GPU Thread Block with a Multi-Phase Representation of Warps Execution (Artifact)

Authors: Louison Jeanmougin, Thomas Carle, and Christine Rochange

Published in: DARTS, Volume 11, Issue 1, Special Issue of the 37th Euromicro Conference on Real-Time Systems (ECRTS 2025)


Abstract
This paper proposes to model the Worst-Case Execution Time (WCET) of a GPU thread block as the Worst-Case Response Time (WCRT) of the warps composing the block. Inspired by the WCRT analyzes for classical CPU tasks, the response time of a warp is modeled as its execution time in isolation added to an interference term that accounts for the execution of higher priority warps. We provide an algorithm to build a representation of the execution of each warp of a thread block that distinguishes phases of execution on the functional units and phases of idleness due to operations latency. A simple formula relying on this model is then proposed to safely upper bound the WCRT of warps scheduled under greedy policies such as Greedy-Then-Oldest (GTO) or Loose Round-Robin (LRR). We experimented our approach using simulations of kernels from a GPU benchmark suite on the Accel-Sim simulator. We also evaluated the model on a GPU program that is likely to be found in safety critical systems : SGEMM (Single-precision GEneral Matrix Multiplication). This work constitutes a promising first building block of an analysis pipeline for enabling static WCET computation on GPUs.

Cite as

Louison Jeanmougin, Thomas Carle, and Christine Rochange. Bounding the WCET of a GPU Thread Block with a Multi-Phase Representation of Warps Execution (Artifact). In Special Issue of the 37th Euromicro Conference on Real-Time Systems (ECRTS 2025). Dagstuhl Artifacts Series (DARTS), Volume 11, Issue 1, pp. 3:1-3:5, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@Article{jeanmougin_et_al:DARTS.11.1.3,
  author =	{Jeanmougin, Louison and Carle, Thomas and Rochange, Christine},
  title =	{{Bounding the WCET of a GPU Thread Block with a Multi-Phase Representation of Warps Execution (Artifact)}},
  pages =	{3:1--3:5},
  journal =	{Dagstuhl Artifacts Series},
  ISSN =	{2509-8195},
  year =	{2025},
  volume =	{11},
  number =	{1},
  editor =	{Jeanmougin, Louison and Carle, Thomas and Rochange, Christine},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DARTS.11.1.3},
  URN =		{urn:nbn:de:0030-drops-236047},
  doi =		{10.4230/DARTS.11.1.3},
  annote =	{Keywords: GPU, WCET analysis}
}
Document
Warp-Level CFG Construction for GPU Kernel WCET Analysis

Authors: Louison Jeanmougin, Pascal Sotin, Christine Rochange, and Thomas Carle

Published in: OASIcs, Volume 114, 21th International Workshop on Worst-Case Execution Time Analysis (WCET 2023)


Abstract
We present an abstract interpretation technique to automatically build a Control Flow Graph (CFG) representation of the execution of a GPU kernel. GPUs implement an inherently parallel execution model, in which threads are grouped within so-called warps that execute in lockstep. This execution model enables the representation of the execution of the threads of a warp as a single CFG. However, thread divergence may appear within a warp and its effect must be captured explicitly within the CFG. Our method builds the CFG of a warp by applying abstract interpretation on the assembly (Nvidia SASS) code of a kernel, and by maintaining an abstract representation of which threads within the warp agree on which values. This allows the method to detect precisely the points in the program where thread divergence may occur, and avoid spurious reactivation edges in the CFG. We apply our technique on benchmark kernels as a proof-of-concept, and generate IPET systems using the resulting CFGs.

Cite as

Louison Jeanmougin, Pascal Sotin, Christine Rochange, and Thomas Carle. Warp-Level CFG Construction for GPU Kernel WCET Analysis. In 21th International Workshop on Worst-Case Execution Time Analysis (WCET 2023). Open Access Series in Informatics (OASIcs), Volume 114, pp. 1:1-1:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{jeanmougin_et_al:OASIcs.WCET.2023.1,
  author =	{Jeanmougin, Louison and Sotin, Pascal and Rochange, Christine and Carle, Thomas},
  title =	{{Warp-Level CFG Construction for GPU Kernel WCET Analysis}},
  booktitle =	{21th International Workshop on Worst-Case Execution Time Analysis (WCET 2023)},
  pages =	{1:1--1:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-293-8},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{114},
  editor =	{W\"{a}gemann, Peter},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.WCET.2023.1},
  URN =		{urn:nbn:de:0030-drops-184303},
  doi =		{10.4230/OASIcs.WCET.2023.1},
  annote =	{Keywords: Graphical Processing Unit (GPU), Control Flow Graphs (CFG), Worst-Case Execution Time (WCET), Program analysis}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail