Search Results

Documents authored by Wagle, Rohan


Document
Autonomy Today: Many Delay-Prone Black Boxes

Authors: Sizhe Liu, Rohan Wagle, James H. Anderson, Ming Yang, Chi Zhang, and Yunhua Li

Published in: LIPIcs, Volume 298, 36th Euromicro Conference on Real-Time Systems (ECRTS 2024)


Abstract
Machine-learning (ML) technology has been a key enabler in the push towards realizing ever more sophisticated autonomous-driving features. In deploying such technology, the automotive industry has relied heavily on using "black-box" software and hardware components that were originally intended for non-safety-critical contexts, without a full understanding of their real-time capabilities. A prime example of such a component is CUDA, which is fundamental to the acceleration of ML algorithms using NVIDIA GPUs. In this paper, evidence is presented demonstrating that CUDA can cause unbounded task delays. Such delays are the result of CUDA’s usage of synchronization mechanisms in the POSIX thread (pthread) library, so the latter is implicated as a delay-prone component as well. Such synchronization delays are shown to be the source of a system failure that occurred in an actual autonomous vehicle system during testing at WeRide. Motivated by these findings, a broader experimental study is presented that demonstrates several real-time deficiencies in CUDA, the glibc pthread library, Linux, and the POSIX interface of the safety-certified QNX Operating System for Safety. Partial mitigations for these deficiencies are presented and further actions are proposed for real-time researchers and developers to integrate more complete mitigations.

Cite as

Sizhe Liu, Rohan Wagle, James H. Anderson, Ming Yang, Chi Zhang, and Yunhua Li. Autonomy Today: Many Delay-Prone Black Boxes. In 36th Euromicro Conference on Real-Time Systems (ECRTS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 298, pp. 12:1-12:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{liu_et_al:LIPIcs.ECRTS.2024.12,
  author =	{Liu, Sizhe and Wagle, Rohan and Anderson, James H. and Yang, Ming and Zhang, Chi and Li, Yunhua},
  title =	{{Autonomy Today: Many Delay-Prone Black Boxes}},
  booktitle =	{36th Euromicro Conference on Real-Time Systems (ECRTS 2024)},
  pages =	{12:1--12:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-324-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{298},
  editor =	{Pellizzoni, Rodolfo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2024.12},
  URN =		{urn:nbn:de:0030-drops-203152},
  doi =		{10.4230/LIPIcs.ECRTS.2024.12},
  annote =	{Keywords: autonomous driving, CUDA programming, locking protocols, POSIX thread, operating systems, machine learning systems, real-time systems}
}
Document
Artifact
Autonomy Today: Many Delay-Prone Black Boxes (Artifact)

Authors: Sizhe Liu, Rohan Wagle, James H. Anderson, Ming Yang, Chi Zhang, and Yunhua Li

Published in: DARTS, Volume 10, Issue 1, Special Issue of the 36th Euromicro Conference on Real-Time Systems (ECRTS 2024)


Abstract
Machine-learning (ML) technology has been a key enabler in the push towards realizing ever more sophisticated autonomous-driving features. In deploying such technology, the automotive industry has relied heavily on using "black-box" software and hardware components that were originally intended for non-safety-critical contexts, without a full understanding of their real-time capabilities. A prime example of such a component is CUDA, which is fundamental to the acceleration of ML algorithms using NVIDIA GPUs. In this paper, evidence is presented demonstrating that CUDA can cause unbounded task delays. Such delays are the result of CUDA’s usage of synchronization mechanisms in the POSIX thread (pthread) library, so the latter is implicated as a delay-prone component as well. Such synchronization delays are shown to be the source of a system failure that occurred in an actual autonomous vehicle system during testing at WeRide. Motivated by these findings, a broader experimental study is presented that demonstrates several real-time deficiencies in CUDA, the glibc pthread library, Linux, and the POSIX interface of the safety-certified QNX Operating System for Safety. Partial mitigations for these deficiencies are presented and further actions are proposed for real-time researchers and developers to integrate more complete mitigations.

Cite as

Sizhe Liu, Rohan Wagle, James H. Anderson, Ming Yang, Chi Zhang, and Yunhua Li. Autonomy Today: Many Delay-Prone Black Boxes (Artifact). In Special Issue of the 36th Euromicro Conference on Real-Time Systems (ECRTS 2024). Dagstuhl Artifacts Series (DARTS), Volume 10, Issue 1, pp. 3:1-3:3, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@Article{liu_et_al:DARTS.10.1.3,
  author =	{Liu, Sizhe and Wagle, Rohan and Anderson, James H. and Yang, Ming and Zhang, Chi and Li, Yunhua},
  title =	{{Autonomy Today: Many Delay-Prone Black Boxes (Artifact)}},
  pages =	{3:1--3:3},
  journal =	{Dagstuhl Artifacts Series},
  ISBN =	{978-3-95977-327-0},
  ISSN =	{2509-8195},
  year =	{2024},
  volume =	{10},
  number =	{1},
  editor =	{Liu, Sizhe and Wagle, Rohan and Anderson, James H. and Yang, Ming and Zhang, Chi and Li, Yunhua},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DARTS.10.1.3},
  URN =		{urn:nbn:de:0030-drops-203259},
  doi =		{10.4230/DARTS.10.1.3},
  annote =	{Keywords: autonomous driving, CUDA programming, locking protocols, POSIX thread, operating systems, machine learning systems, real-time systems}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail