GCAPS: GPU Context-Aware Preemptive Priority-Based Scheduling for Real-Time Tasks

Authors Yidi Wang, Cong Liu, Daniel Wong, Hyoseung Kim


Author Details

Yidi Wang
  • University of California, Riverside, CA, USA
Cong Liu
  • University of California, Riverside, CA, USA
Daniel Wong
  • University of California, Riverside, CA, USA
Hyoseung Kim
  • University of California, Riverside, CA, USA

Cite As

Yidi Wang, Cong Liu, Daniel Wong, and Hyoseung Kim. GCAPS: GPU Context-Aware Preemptive Priority-Based Scheduling for Real-Time Tasks. In 36th Euromicro Conference on Real-Time Systems (ECRTS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 298, pp. 14:1-14:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Scheduling real-time tasks that utilize GPUs with analyzable guarantees poses a significant challenge due to the intricate interaction between CPU and GPU resources, as well as the complex GPU hardware and software stack. While much research has been conducted in the real-time research community, several limitations persist, including the absence or limited availability of GPU-level preemption, extended blocking times, and/or the need for extensive modifications to program code. In this paper, we propose GCAPS, a GPU Context-Aware Preemptive Scheduling approach for real-time GPU tasks. Our approach exerts control over GPU context scheduling at the device driver level and enables preemption of GPU execution based on task priorities by simply adding one-line macros to GPU segment boundaries. In addition, we provide a comprehensive response time analysis of GPU-using tasks for both our proposed approach and the default Nvidia GPU driver scheduling, which follows a work-conserving round-robin policy. Through empirical evaluations and case studies, we demonstrate the effectiveness of the proposed approaches in improving taskset schedulability and response time. The results highlight significant improvements over prior work as well as the default scheduling approach, with up to 40% higher schedulability, while also achieving predictable worst-case behavior on Nvidia Jetson embedded platforms.
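
The contrast the abstract draws between round-robin time slicing and priority-based preemption can be illustrated with a toy simulation. The sketch below is not the GCAPS driver mechanism or the paper's response time analysis; it assumes a single GPU, unit time steps, zero context-switch overhead, and a hypothetical `simulate` function, task set, and slice length chosen purely for illustration.

```python
def simulate(tasks, policy, slice_len=2):
    """Return {task name: finish time} for GPU segments sharing one GPU.

    tasks maps a name to (release, demand, priority); a larger number
    means a higher priority. policy is "round_robin" or "priority".
    All values are hypothetical integer time units.
    """
    remaining = {name: demand for name, (_, demand, _) in tasks.items()}
    ready, admitted, finish = [], set(), {}
    now, used = 0, 0
    while len(finish) < len(tasks):
        # Admit every segment whose release time has arrived.
        for name, (release, _, _) in tasks.items():
            if release <= now and name not in admitted:
                admitted.add(name)
                ready.append(name)
        if not ready:
            now += 1  # GPU idles until the next release
            continue
        if policy == "priority":
            # Preemptive: re-pick the highest-priority ready segment each step.
            running = max(ready, key=lambda n: tasks[n][2])
        else:
            # Round-robin: the head of the queue runs for up to slice_len steps.
            running = ready[0]
        remaining[running] -= 1
        now += 1
        used += 1
        if remaining[running] == 0:
            finish[running] = now
            ready.remove(running)
            used = 0
        elif policy == "round_robin" and used == slice_len:
            ready.append(ready.pop(0))  # slice expired: move to the back
            used = 0
    return finish


# Hypothetical task set: (release, GPU demand, priority).
tasks = {"low": (0, 6, 1), "mid": (0, 4, 2), "high": (1, 2, 3)}
print(simulate(tasks, "round_robin"))
print(simulate(tasks, "priority"))
```

In this example the high-priority segment, released at time 1, finishes at time 6 under round-robin but at time 3 under priority-based preemption, while the total work completes at time 12 in both cases. A real system would also have to account for context-switch cost and CPU-side execution, which this sketch ignores.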

Subject Classification

ACM Subject Classification
  • Computer systems organization → Real-time systems
  • Computer systems organization → Embedded and cyber-physical systems
Keywords
  • Real-time systems
  • GPU scheduling



