GCAPS: GPU Context-Aware Preemptive Priority-Based Scheduling for Real-Time Tasks

Wang, Yidi; Liu, Cong; Wong, Daniel; Kim, Hyoseung

doi:10.4230/LIPIcs.ECRTS.2024.14

Author Details

Yidi Wang

University of California, Riverside, CA, USA

Cong Liu

University of California, Riverside, CA, USA

Daniel Wong

University of California, Riverside, CA, USA

Hyoseung Kim

University of California, Riverside, CA, USA

Cite As

Yidi Wang, Cong Liu, Daniel Wong, and Hyoseung Kim. GCAPS: GPU Context-Aware Preemptive Priority-Based Scheduling for Real-Time Tasks. In 36th Euromicro Conference on Real-Time Systems (ECRTS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 298, pp. 14:1-14:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ECRTS.2024.14

Abstract

Scheduling real-time tasks that utilize GPUs with analyzable guarantees poses a significant challenge due to the intricate interaction between CPU and GPU resources, as well as the complex GPU hardware and software stack. While much research has been conducted in the real-time research community, several limitations persist, including the absence or limited availability of GPU-level preemption, extended blocking times, and/or the need for extensive modifications to program code. In this paper, we propose GCAPS, a GPU Context-Aware Preemptive Scheduling approach for real-time GPU tasks. Our approach exerts control over GPU context scheduling at the device driver level and enables preemption of GPU execution based on task priorities by simply adding one-line macros to GPU segment boundaries. In addition, we provide a comprehensive response time analysis of GPU-using tasks for both our proposed approach and the default Nvidia GPU driver scheduling, which follows a work-conserving round-robin policy. Through empirical evaluations and case studies, we demonstrate the effectiveness of the proposed approaches in improving taskset schedulability and response time. The results highlight significant improvements over prior work as well as the default scheduling approach, with up to 40% higher schedulability, while also achieving predictable worst-case behavior on Nvidia Jetson embedded platforms.

Subject Classification

ACM Subject Classification

Computer systems organization → Real-time systems
Computer systems organization → Embedded and cyber-physical systems

Keywords

Real-time systems
GPU scheduling


