% Copyright James H. Anderson; licensed under the Creative Commons Attribution 4.0 International license.
@InProceedings{bakita_et_al:LIPIcs.ECRTS.2025.21,
  author =    {Bakita, Joshua and Anderson, James H.},
  title =     {{Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems}},
  booktitle = {37th Euromicro Conference on Real-Time Systems (ECRTS 2025)},
  pages =     {21:1--21:25},
  series =    {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =      {978-3-95977-377-5},
  ISSN =      {1868-8969},
  year =      {2025},
  volume =    {335},
  editor =    {Mancuso, Renato},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =   {Dagstuhl, Germany},
  URL =       {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2025.21},
  URN =       {urn:nbn:de:0030-drops-235998},
  doi =       {10.4230/LIPIcs.ECRTS.2025.21},
  annote =    {Keywords: Real-time systems, composable systems, graphics processing units, CUDA},
  abstract =  {As GPU-using tasks become more common in embedded, safety-critical systems, efficiency demands necessitate sharing a single GPU among multiple tasks. Unfortunately, existing ways to schedule multiple tasks onto a GPU often result in either a loss of the ability to meet deadlines or a loss of efficiency. In this work, we develop a system-level spatial compute partitioning mechanism for NVIDIA GPUs and demonstrate that it can be used to execute tasks efficiently without compromising timing predictability. Our tool, called nvtaskset, supports composable systems by requiring no task, driver, or hardware modifications. In our evaluation, we demonstrate sub-1-$\mu$s overheads, stronger partition enforcement, and finer-granularity partitioning when using our mechanism instead of NVIDIA's Multi-Process Service (MPS) or Multi-Instance GPU (MIG) features.}
}