Autonomy Today: Many Delay-Prone Black Boxes (Artifact)

Authors Sizhe Liu, Rohan Wagle, James H. Anderson, Ming Yang, Chi Zhang, Yunhua Li




Artifact Description

DARTS.10.1.3.pdf
  • Filesize: 0.53 MB
  • 3 pages

Author Details

Sizhe Liu
  • University of North Carolina at Chapel Hill, NC, USA
Rohan Wagle
  • University of North Carolina at Chapel Hill, NC, USA
James H. Anderson
  • University of North Carolina at Chapel Hill, NC, USA
Ming Yang
  • WeRide Corp., San Jose, CA, USA
Chi Zhang
  • WeRide Corp., San Jose, CA, USA
Yunhua Li
  • WeRide Corp., San Jose, CA, USA

Cite As

Sizhe Liu, Rohan Wagle, James H. Anderson, Ming Yang, Chi Zhang, and Yunhua Li. Autonomy Today: Many Delay-Prone Black Boxes (Artifact). In Special Issue of the 36th Euromicro Conference on Real-Time Systems (ECRTS 2024). Dagstuhl Artifacts Series (DARTS), Volume 10, Issue 1, pp. 3:1-3:3, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/DARTS.10.1.3

Abstract

Machine-learning (ML) technology has been a key enabler in the push towards realizing ever more sophisticated autonomous-driving features. In deploying such technology, the automotive industry has relied heavily on using "black-box" software and hardware components that were originally intended for non-safety-critical contexts, without a full understanding of their real-time capabilities. A prime example of such a component is CUDA, which is fundamental to the acceleration of ML algorithms using NVIDIA GPUs. In this paper, evidence is presented demonstrating that CUDA can cause unbounded task delays. Such delays are the result of CUDA’s usage of synchronization mechanisms in the POSIX thread (pthread) library, so the latter is implicated as a delay-prone component as well. Such synchronization delays are shown to be the source of a system failure that occurred in an actual autonomous vehicle system during testing at WeRide. Motivated by these findings, a broader experimental study is presented that demonstrates several real-time deficiencies in CUDA, the glibc pthread library, Linux, and the POSIX interface of the safety-certified QNX Operating System for Safety. Partial mitigations for these deficiencies are presented and further actions are proposed for real-time researchers and developers to integrate more complete mitigations.
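The kind of synchronization delay described in the abstract can be illustrated with a minimal sketch. It is not taken from the artifact and does not use CUDA or the pthread API directly; it uses Python's `threading.Lock` as a stand-in, and the function name `measure_blocking_delay` and the `hold_seconds` parameter are hypothetical names chosen for illustration. It shows only the basic effect: a task that needs a lock held inside another component blocks for as long as that component keeps the lock, so the task's delay is bounded only if the holder's critical section is.

```python
import threading
import time


def measure_blocking_delay(hold_seconds: float) -> float:
    """Return how long a waiter blocks on a lock held for hold_seconds.

    Hypothetical helper for illustration; not part of the artifact.
    """
    lock = threading.Lock()
    holder_has_lock = threading.Event()

    def holder() -> None:
        # Stand-in for a black-box component that takes a lock internally
        # and keeps it for a duration the caller cannot control.
        with lock:
            holder_has_lock.set()
            time.sleep(hold_seconds)

    thread = threading.Thread(target=holder)
    thread.start()
    holder_has_lock.wait()  # start timing only once the lock is held

    start = time.monotonic()
    with lock:  # the "task" blocks here until the holder releases the lock
        delay = time.monotonic() - start
    thread.join()
    return delay


if __name__ == "__main__":
    for hold in (0.01, 0.05, 0.1):
        blocked = measure_blocking_delay(hold)
        print(f"holder held lock for {hold:.2f}s, waiter blocked {blocked:.3f}s")
```

In this sketch the waiter's blocking time tracks the holder's critical-section length. If the holder were itself preempted or blocked while holding the lock, the waiter's delay would grow accordingly, which is the general shape of the problem the abstract attributes to locking inside black-box components.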

Subject Classification

ACM Subject Classification
  • Computer systems organization → Real-time operating systems
  • Software and its engineering → Process synchronization
Keywords
  • autonomous driving
  • CUDA programming
  • locking protocols
  • POSIX thread
  • operating systems
  • machine learning systems
  • real-time systems


References

  1. Sizhe Liu, Rohan Wagle, James H. Anderson, Ming Yang, Chi Zhang, and Yunhua Li. Autonomy today: Many delay-prone black boxes. In Proceedings of the 36th Euromicro Conference on Real-Time Systems, volume 298 of Leibniz International Proceedings in Informatics (LIPIcs), pages 12:1-12:27. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2024. URL: https://doi.org/10.4230/LIPIcs.ECRTS.2024.12.