DROPS

Volume

OASIcs, Volume 127

16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)

PARMA-DITAM 2025, January 22, 2025, Barcelona, Spain

Editors: Daniele Cattaneo, Maria Fazio, Leonidas Kosmidis, and Gabriele Morabito

Document

DOI: 10.4230/LIPIcs.ECRTS.2025.20

Detecting Low-Density Mixtures in High-Quantile Tails for pWCET Estimation

Authors: Blau Manau, Sergi Vilardell, Isabel Serra, Enrico Mezzetti, Jaume Abella, and Francisco J. Cazorla

Published in: LIPIcs, Volume 335, 37th Euromicro Conference on Real-Time Systems (ECRTS 2025)

Abstract

The variability arising from sophisticated hardware and software solutions in cutting-edge embedded products causes software to exhibit complex execution time distributions. Mixture distributions can happen, with different density (weight), as a result of inherent different features in the execution platform and multiple operational scenarios. In the context of probabilistic WCET (pWCET) analysis based on Extreme Value Theory (EVT), where identical distribution is a pre-requisite, mixtures are typically intercepted by applying stationarity tests on the full sample. Those tests, however, are instructed to detect only mixtures with sufficiently high probability (weight) and disregard low-density mixtures (which are unlikely to be preserved in the high-quantile tail of the sample) as they would prevent any form of stationarity. Nonetheless, low-density mixture distributions can persist and even exacerbate in the tail, and, when not considered, they can impair pWCET estimation in EVT-based approaches, leading to overly pessimistic or optimistic bounds. In this work, we propose TailID, an iterative point-wise approach that builds on the asymptotic convergence of the Maximum Likelihood Estimator (MLE) of the Extreme Value Index (EVI) parameter ξ to detect low-density mixture distributions on high-quantile tails and use this information to steer EVT tail selection. The benefits of the proposed method are assessed on synthetic mixture distributions and real data collected on an industrially representative embedded platform.

Cite as

Blau Manau, Sergi Vilardell, Isabel Serra, Enrico Mezzetti, Jaume Abella, and Francisco J. Cazorla. Detecting Low-Density Mixtures in High-Quantile Tails for pWCET Estimation. In 37th Euromicro Conference on Real-Time Systems (ECRTS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 335, pp. 20:1-20:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{manau_et_al:LIPIcs.ECRTS.2025.20,
  author =	{Manau, Blau and Vilardell, Sergi and Serra, Isabel and Mezzetti, Enrico and Abella, Jaume and Cazorla, Francisco J.},
  title =	{{Detecting Low-Density Mixtures in High-Quantile Tails for pWCET Estimation}},
  booktitle =	{37th Euromicro Conference on Real-Time Systems (ECRTS 2025)},
  pages =	{20:1--20:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-377-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{335},
  editor =	{Mancuso, Renato},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2025.20},
  URN =		{urn:nbn:de:0030-drops-235982},
  doi =		{10.4230/LIPIcs.ECRTS.2025.20},
  annote =	{Keywords: WCET, EVT}
}

Document

DOI: 10.4230/OASIcs.NG-RES.2025.5

SP-IMPact: A Framework for Static Partitioning Interference Mitigation and Performance Analysis

Authors: Diogo Costa, Gonçalo Moreira, Afonso Oliveira, José Martins, and Sandro Pinto

Published in: OASIcs, Volume 128, Sixth Workshop on Next Generation Real-Time Embedded Systems (NG-RES 2025)

Abstract

Modern embedded systems are evolving toward complex, heterogeneous architectures to accommodate increasingly demanding applications. Driven by industry SWAP-C (Size, Weight, Power, and Cost) constraints, this shift has led to the consolidation of multiple systems onto single hardware platforms. Static Partitioning Hypervisors (SPHs) offer a promising solution to partition hardware resources and provide spatial isolation between critical workloads. However, shared hardware resources like the Last-Level Cache (LLC) and system bus can introduce significant temporal interference between virtual machines (VMs), negatively impacting performance and predictability. Over the past decade, academia and industry have focused on developing interference mitigation techniques, such as cache partitioning and memory bandwidth reservation. Configuring these techniques, however, is complex and time-consuming. Cache partitioning requires careful balancing of cache sections across VMs, while memory bandwidth reservation requires tuning bandwidth budgets and periods. With numerous possible configurations, testing all combinations is impractical and often leads to suboptimal configurations. Moreover, there is a gap in understanding how these techniques interact, as their combined use can result in compounded or conflicting effects on system performance. Static analysis solutions that estimate worst-case execution times (WCET) and upper bounds on execution times provide some guidance for configuring interference mitigation techniques. While useful in identifying potential interference effects, these tools often fail to capture the full complexity of modern multi-core systems, as they typically focus on a limited set of shared resources and neglect other sources of contention, such as IOMMUs and interrupt controllers. To address these challenges, we introduce SP-IMPact, an open-source framework designed to analyze and guide the configuration of interference mitigation techniques, through the deployment of diverse VM configurations and setups, and assessment of hardware-level contention (leveraging SPHs). It supports two mitigation techniques: (i) cache coloring and (ii) memory bandwidth reservation, while also evaluating the interactions between these techniques and their cumulative impact on system performance. By providing insights on real hardware platforms, SP-IMPact helps to optimize the configuration of these techniques in mixed-criticality systems, ensuring both performance and predictability.

Cite as

Diogo Costa, Gonçalo Moreira, Afonso Oliveira, José Martins, and Sandro Pinto. SP-IMPact: A Framework for Static Partitioning Interference Mitigation and Performance Analysis. In Sixth Workshop on Next Generation Real-Time Embedded Systems (NG-RES 2025). Open Access Series in Informatics (OASIcs), Volume 128, pp. 5:1-5:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{costa_et_al:OASIcs.NG-RES.2025.5,
  author =	{Costa, Diogo and Moreira, Gon\c{c}alo and Oliveira, Afonso and Martins, Jos\'{e} and Pinto, Sandro},
  title =	{{SP-IMPact: A Framework for Static Partitioning Interference Mitigation and Performance Analysis}},
  booktitle =	{Sixth Workshop on Next Generation Real-Time Embedded Systems (NG-RES 2025)},
  pages =	{5:1--5:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-366-9},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{128},
  editor =	{Yomsi, Patrick Meumeu and Wildermann, Stefan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.NG-RES.2025.5},
  URN =		{urn:nbn:de:0030-drops-229911},
  doi =		{10.4230/OASIcs.NG-RES.2025.5},
  annote =	{Keywords: Virtualization, Contention, Multi-core Interference, Mixed-Criticality Systems, Arm}
}

Document

DOI: 10.4230/OASIcs.NG-RES.2025.4

H-MBR: Hypervisor-Level Memory Bandwidth Reservation for Mixed Criticality Systems

Authors: Afonso Oliveira, Diogo Costa, Gonçalo Moreira, José Martins, and Sandro Pinto

Published in: OASIcs, Volume 128, Sixth Workshop on Next Generation Real-Time Embedded Systems (NG-RES 2025)

Abstract

Recent advancements in fields such as automotive and aerospace have driven a growing demand for robust computational resources. Applications that were once designed for basic Microcontroller Units (MCUs) are now deployed on highly heterogeneous System-on-Chip (SoC) platforms. While these platforms deliver the necessary computational performance, they also present challenges related to resource sharing and predictability. These challenges are particularly pronounced when consolidating safety-critical and non-safety-critical systems, the so-called Mixed-Criticality Systems (MCS) to adhere to strict Size, Weight, Power, and Cost (SWaP-C) requirements. MCS consolidation on shared platforms requires stringent spatial and temporal isolation to comply with functional safety standards (e.g., ISO 26262). Virtualization, mainly leveraged by hypervisors, is a key technology that ensures spatial isolation across multiple OSes and applications; however ensuring temporal isolation remains challenging due to contention on shared resources, such as main memory, caches, and system buses, which impacts real-time performance and predictability. To mitigate this problem, several strategies (e.g., cache coloring and memory bandwidth reservation) have been proposed. Although cache coloring is typically implemented on state-of-the-art hypervisors, memory bandwidth reservation approaches are commonly implemented at the Linux kernel level or rely on dedicated hardware and typically do not consider the concept of Virtual Machines that can run different OSes. To fill the gap between current memory bandwidth reservation solutions and the deployment of MCSs that operate on a hypervisor, this work introduces H-MBR, an open-source VM-centric memory bandwidth reservation mechanism. H-MBR features (i) VM-centric bandwidth reservation, (ii) OS and platform agnosticism, and (iii) reduced overhead. Empirical results evidenced no overhead on non-regulated workloads, and negligible overhead (<1%) for regulated workloads for regulation periods of 2 µs or higher.

Cite as

Afonso Oliveira, Diogo Costa, Gonçalo Moreira, José Martins, and Sandro Pinto. H-MBR: Hypervisor-Level Memory Bandwidth Reservation for Mixed Criticality Systems. In Sixth Workshop on Next Generation Real-Time Embedded Systems (NG-RES 2025). Open Access Series in Informatics (OASIcs), Volume 128, pp. 4:1-4:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{oliveira_et_al:OASIcs.NG-RES.2025.4,
  author =	{Oliveira, Afonso and Costa, Diogo and Moreira, Gon\c{c}alo and Martins, Jos\'{e} and Pinto, Sandro},
  title =	{{H-MBR: Hypervisor-Level Memory Bandwidth Reservation for Mixed Criticality Systems}},
  booktitle =	{Sixth Workshop on Next Generation Real-Time Embedded Systems (NG-RES 2025)},
  pages =	{4:1--4:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-366-9},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{128},
  editor =	{Yomsi, Patrick Meumeu and Wildermann, Stefan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.NG-RES.2025.4},
  URN =		{urn:nbn:de:0030-drops-229905},
  doi =		{10.4230/OASIcs.NG-RES.2025.4},
  annote =	{Keywords: Virtualization, Multi-core Interference, Mixed-Criticality Systems, Arm, Memory Bandwidth Reservation}
}

Artifact

Dataset

DOI: 10.4230/artifacts.22756

marcosrc92/MARS-data

Authors: Marcos Rodriguez

Abstract

Cite as

Marcos Rodriguez. marcosrc92/MARS-data (Dataset, Experiment results). Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@misc{dagstuhl-artifact-22756,
   title = {{marcosrc92/MARS-data}}, 
   author = {Rodriguez, Marcos},
   note = {Dataset, swhId: \href{https://archive.softwareheritage.org/swh:1:dir:0b2fbba6fbdfa60cb3d84175a2dd102bfb293ff2;origin=https://github.com/marcosrc92/MARS-data;visit=swh:1:snp:6f1643cf47dbc1506addc434ecc22f14c0730e65;anchor=swh:1:rev:6a4e0d0c5bdb8978113179478ef33cc5b9781fe9}{\texttt{swh:1:dir:0b2fbba6fbdfa60cb3d84175a2dd102bfb293ff2}} (visited on 2025-02-27)},
   url = {https://github.com/marcosrc92/MARS-data},
   doi = {10.4230/artifacts.22756},
}

Document

Complete Volume

DOI: 10.4230/OASIcs.PARMA-DITAM.2025

OASIcs, Volume 127, PARMA-DITAM 2025, Complete Volume

Authors: Daniele Cattaneo, Maria Fazio, Leonidas Kosmidis, and Gabriele Morabito

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)

Abstract

OASIcs, Volume 127, PARMA-DITAM 2025, Complete Volume

Cite as

16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 1-100, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Proceedings{cattaneo_et_al:OASIcs.PARMA-DITAM.2025,
  title =	{{OASIcs, Volume 127, PARMA-DITAM 2025, Complete Volume}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{1--100},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025},
  URN =		{urn:nbn:de:0030-drops-229378},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025},
  annote =	{Keywords: OASIcs, Volume 127, PARMA-DITAM 2025, Complete Volume}
}

Document

Front Matter

DOI: 10.4230/OASIcs.PARMA-DITAM.2025.0

Front Matter, Table of Contents, Preface, Conference Organization

Authors: Daniele Cattaneo, Maria Fazio, Leonidas Kosmidis, and Gabriele Morabito

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)

Abstract

Front Matter, Table of Contents, Preface, Conference Organization

Cite as

16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 0:i-0:x, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{cattaneo_et_al:OASIcs.PARMA-DITAM.2025.0,
  author =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{0:i--0:x},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.0},
  URN =		{urn:nbn:de:0030-drops-229363},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}

@InProceedings{cattaneo_et_al:OASIcs.PARMA-DITAM.2025.0,
  author =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{0:i--0:x},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.0},
  URN =		{urn:nbn:de:0030-drops-229363},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}

Document

DOI: 10.4230/OASIcs.PARMA-DITAM.2025.6

HiPART: High-Performance Technology for Advanced Real-Time Systems

Authors: Sara Royuela, Adrian Munera, Chenle Yu, and Josep Pinot

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)

Abstract

Cyber-physical systems (CPS) attempt to meet real-time and safety requirements by using hypervisors that provide isolation via virtualisation and Real-Time Operating Systems that manage the concurrency of system tasks. However, the operating system’s (OS) decisions may hinder the efficiency of tasks because it needs more awareness of their specific intricacies. Hence, one critical limitation to efficiently developing CPSs is the lack of tailored parallel programming models that can harness the capabilities of advanced heterogeneous architectures while meeting the requirements integral to CPSs, such as real-time behaviour and safety requirements. While conventional HPC languages, like OpenMP and CUDA, cannot accommodate critical non-functional properties, safety languages, like Rust and Ada, are limited in their capabilities to exploit complex systems efficiently. On top of that, accessibility to the programming task is essential to making the system usable to different domain experts. HiPART tackles these challenges by developing a comprehensive framework holistically addressing efficiency, interoperability, reliability, and sustainability. The HiPART framework, based on OpenMP, provides tailored support for (1) real-time behaviour and safety requirements and (2) the efficient exploitation of advanced parallel and heterogeneous processor architectures. This support is exposed to users through extensions to the OpenMP specification and its implementation in the LLVM framework, including the compiler and the OpenMP runtime library. With this framework, HiPART will contribute to realising more capable and reliable autonomous systems across various domains, from autonomous mobility to space exploration.

Cite as

Sara Royuela, Adrian Munera, Chenle Yu, and Josep Pinot. HiPART: High-Performance Technology for Advanced Real-Time Systems. In 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 6:1-6:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{royuela_et_al:OASIcs.PARMA-DITAM.2025.6,
  author =	{Royuela, Sara and Munera, Adrian and Yu, Chenle and Pinot, Josep},
  title =	{{HiPART: High-Performance Technology for Advanced Real-Time Systems}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{6:1--6:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.6},
  URN =		{urn:nbn:de:0030-drops-229108},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.6},
  annote =	{Keywords: Cyber-physical systems, OpenMP, Parallel and heterogeneous architectures, Efficiency, Adaptability, Interoperability, Real-time, Resilience, Reliability}
}

@InProceedings{royuela_et_al:OASIcs.PARMA-DITAM.2025.6,
  author =	{Royuela, Sara and Munera, Adrian and Yu, Chenle and Pinot, Josep},
  title =	{{HiPART: High-Performance Technology for Advanced Real-Time Systems}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{6:1--6:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.6},
  URN =		{urn:nbn:de:0030-drops-229108},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.6},
  annote =	{Keywords: Cyber-physical systems, OpenMP, Parallel and heterogeneous architectures, Efficiency, Adaptability, Interoperability, Real-time, Resilience, Reliability}
}

Document

DOI: 10.4230/OASIcs.PARMA-DITAM.2025.3

System-Level Timing Performance Estimation Based on a Unifying HW/SW Performance Metric

Authors: Vittoriano Muttillo, Vincenzo Stoico, Giacomo Valente, Marco Santic, Luigi Pomante, and Daniele Frigioni

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)

Abstract

The rapidly increasing complexity of embedded systems and the critical impact of non-functional requirements demand the adoption of an appropriate system-level HW/SW co-design methodology. This methodology tries to satisfy all design requirements by simultaneously considering several alternative HW/SW implementations. In this context, early performance estimation approaches are crucial in reducing the design space, thereby minimizing design time and cost. To address the challenge of system-level performance estimation, this work presents and formalizes a novel approach based on a unifying HW/SW performance metric for early execution time estimation. The proposed approach estimates the execution time of a C function when executed by different HW/SW processor technologies. The approach is validated through an extensive experimental study, demonstrating its effectiveness and efficiency in terms of estimation error (i.e., lower than 10%) and estimation time (close to zero) when compared to existing methods in the literature.

Cite as

Vittoriano Muttillo, Vincenzo Stoico, Giacomo Valente, Marco Santic, Luigi Pomante, and Daniele Frigioni. System-Level Timing Performance Estimation Based on a Unifying HW/SW Performance Metric. In 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 3:1-3:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{muttillo_et_al:OASIcs.PARMA-DITAM.2025.3,
  author =	{Muttillo, Vittoriano and Stoico, Vincenzo and Valente, Giacomo and Santic, Marco and Pomante, Luigi and Frigioni, Daniele},
  title =	{{System-Level Timing Performance Estimation Based on a Unifying HW/SW Performance Metric}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{3:1--3:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.3},
  URN =		{urn:nbn:de:0030-drops-229071},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.3},
  annote =	{Keywords: embedded systems, hw/sw co-design, performance estimation, lasso, machine learning}
}

@InProceedings{muttillo_et_al:OASIcs.PARMA-DITAM.2025.3,
  author =	{Muttillo, Vittoriano and Stoico, Vincenzo and Valente, Giacomo and Santic, Marco and Pomante, Luigi and Frigioni, Daniele},
  title =	{{System-Level Timing Performance Estimation Based on a Unifying HW/SW Performance Metric}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{3:1--3:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.3},
  URN =		{urn:nbn:de:0030-drops-229071},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.3},
  annote =	{Keywords: embedded systems, hw/sw co-design, performance estimation, lasso, machine learning}
}

Document

DOI: 10.4230/OASIcs.PARMA-DITAM.2025.2

Custom Floating-Point Computations for the Optimization of ODE Solvers on FPGA

Authors: Serena Curzel and Marco Gribaudo

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)

Abstract

Mean Field Analysis and Markovian Agents are powerful techniques for modeling complex systems of distributed interacting objects, for which efficient analytical and numerical solution algorithms can be implemented through linear systems of ordinary differential equations (ODEs). Solving such ODE systems on Field Programmable Gate Arrays (FPGAs) is a promising alternative to traditional CPU- and GPU-based approaches, especially in terms of energy consumption; however, the floating-point computations required are generally thought to be slow and inefficient when implemented on FPGA. In this paper, we demonstrate the use of High-Level Synthesis with automated customization of low-precision floating-point calculations, obtaining hardware accelerators for ODE solvers with improved quality of results and minimal output error. The proposed methodology does not require any manual rewriting of the solver code, but it remains prohibitively slow to evaluate any possible floating-point configuration through logic synthesis; in the future, we will thus implement automated design space exploration methods able to suggest promising configurations under user-defined accuracy and performance constraints.

Cite as

Serena Curzel and Marco Gribaudo. Custom Floating-Point Computations for the Optimization of ODE Solvers on FPGA. In 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 2:1-2:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{curzel_et_al:OASIcs.PARMA-DITAM.2025.2,
  author =	{Curzel, Serena and Gribaudo, Marco},
  title =	{{Custom Floating-Point Computations for the Optimization of ODE Solvers on FPGA}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{2:1--2:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.2},
  URN =		{urn:nbn:de:0030-drops-229064},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.2},
  annote =	{Keywords: Differential Equations, High-Level Synthesis, FPGA, floating-point}
}

@InProceedings{curzel_et_al:OASIcs.PARMA-DITAM.2025.2,
  author =	{Curzel, Serena and Gribaudo, Marco},
  title =	{{Custom Floating-Point Computations for the Optimization of ODE Solvers on FPGA}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{2:1--2:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.2},
  URN =		{urn:nbn:de:0030-drops-229064},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.2},
  annote =	{Keywords: Differential Equations, High-Level Synthesis, FPGA, floating-point}
}

Document

DOI: 10.4230/OASIcs.PARMA-DITAM.2025.4

Towards Studying the Effect of Compiler Optimizations and Software Randomization on GPU Reliability

Authors: Pau López Castillón, Xavier Caricchio Hernández, and Leonidas Kosmidis

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)

Abstract

The evolution of Graphics Processing Unit (GPU) compilers has facilitated the support for general-purpose programming languages across various architectures. The NVIDIA CUDA Compiler (NVCC) employs multiple compilation levels prior to generating machine code, implementing intricate optimizations to enhance performance. These optimizations influence the manner in which software is mapped to the underlying hardware, which can also impact GPU reliability. TASA is a source-to-source code randomization tool designed to alter the mapping of software onto the underlying hardware. It achieves this by generating random permutations of variable and function declarations, thereby introducing random padding between declarations of different types and modifying the program memory layout. Since this modifies their location in the memory, it also modifies their cache placement, affecting both their execution time (due to the different conflicts between them, which result in a different amount of cache misses in every execution), as well as their lifetime in the cache. In this work, which is part of the HiPEAC Student Challenge 2025, we first examine the reproducibility of a subset of data presented in the ACM TACO paper "Assessing the Impact of Compiler Optimizations on GPU Reliability" [Santos et al., 2024], and second we extend it by combining it with our proposal of software randomization. The paper indicates that the -O3 optimization flag facilitates an increased workload before failures occur within the application. By employing TASA, we investigate the impact of GPU randomization on reliability and performance metrics. By reproducing the results of the paper on a different GPU platform, we observe the same trend as reported in the original publication. Moreover, our preliminary results with the application of software randomization show in several cases an improved Mean Waiting Before Failure (MWBF) compared to the original source code.

Cite as

Pau López Castillón, Xavier Caricchio Hernández, and Leonidas Kosmidis. Towards Studying the Effect of Compiler Optimizations and Software Randomization on GPU Reliability. In 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 4:1-4:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{castillon_et_al:OASIcs.PARMA-DITAM.2025.4,
  author =	{Castill\'{o}n, Pau L\'{o}pez and Hern\'{a}ndez, Xavier Caricchio and Kosmidis, Leonidas},
  title =	{{Towards Studying the Effect of Compiler Optimizations and Software Randomization on GPU Reliability}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{4:1--4:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.4},
  URN =		{urn:nbn:de:0030-drops-229083},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.4},
  annote =	{Keywords: Graphics processing units, reliability, software randomization, error rate}
}

@InProceedings{castillon_et_al:OASIcs.PARMA-DITAM.2025.4,
  author =	{Castill\'{o}n, Pau L\'{o}pez and Hern\'{a}ndez, Xavier Caricchio and Kosmidis, Leonidas},
  title =	{{Towards Studying the Effect of Compiler Optimizations and Software Randomization on GPU Reliability}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{4:1--4:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.4},
  URN =		{urn:nbn:de:0030-drops-229083},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.4},
  annote =	{Keywords: Graphics processing units, reliability, software randomization, error rate}
}

Document

DOI: 10.4230/OASIcs.PARMA-DITAM.2025.5

Evaluation of the Parallel Features of Rust for Space Systems

Authors: Alberto Perugini and Leonidas Kosmidis

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)

Abstract

The rise in complexity of the algorithms run on space systems, largely attributable to higher resolution instruments which generate a large amount of the data to be processed, as well as to the need for increased autonomy, which relies on Neural Network inference systems in future missions, demand the adoption of more powerful on-board hardware, such as multicores. At the same time, the correctness and reliability of critical on-board software is of paramount importance for the success of space missions. However, developing such complex software in low-level languages can have a negative impact on these aspects. For this reason, this paper evaluates the role that the Rust programming language can have in this change, given its memory safety and built in support for parallelism, which allows to better utilise more powerful hardware, in particular multicore cpus, without compromising the programmability and safety of the code. To this end, the GPU4S benchmarking suite, part of the open source OBPMark benchmarking suite of the European Space Agency (ESA), is ported to Rust, with sequential and parallel implementations. The performance of the ported benchmarks is compared to the existing sequential and parallel implementations in low-level languages to evaluate the trade-offs of the different solutions, and it is evaluated on several multicore platforms which are candidates for future on-board processing systems. A particular focus is put on parallel versions of the benchmarks, where Rust offers solid native support, as well as library support for fast parallelization similar to OpenMP. Finally, in terms of correctness, the Rust implementations are free of recently detected defects in the low-level implementations of the GPU4S benchmarks.

Cite as

Alberto Perugini and Leonidas Kosmidis. Evaluation of the Parallel Features of Rust for Space Systems. In 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 5:1-5:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{perugini_et_al:OASIcs.PARMA-DITAM.2025.5,
  author =	{Perugini, Alberto and Kosmidis, Leonidas},
  title =	{{Evaluation of the Parallel Features of Rust for Space Systems}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{5:1--5:20},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.5},
  URN =		{urn:nbn:de:0030-drops-229098},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.5},
  annote =	{Keywords: Rust, Multicore, Space Systems}
}

@InProceedings{perugini_et_al:OASIcs.PARMA-DITAM.2025.5,
  author =	{Perugini, Alberto and Kosmidis, Leonidas},
  title =	{{Evaluation of the Parallel Features of Rust for Space Systems}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{5:1--5:20},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.5},
  URN =		{urn:nbn:de:0030-drops-229098},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.5},
  annote =	{Keywords: Rust, Multicore, Space Systems}
}

Document

DOI: 10.4230/OASIcs.PARMA-DITAM.2025.1

Analysis of GPU Memory Allocation Characteristics

Authors: Marcos Rodriguez, Irune Yarza, Leonidas Kosmidis, and Alejandro J. Calderón

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)

Abstract

The number of applications subject to safety-critical regulations is on the rise, and consequently, the computing requirements for such applications are increasing as well. This trend has led to the integration of General-Purpose Graphics Processing Units (GPGPUs) into these systems. However, the inherent characteristics of GPGPUs, including their black-box nature, dynamic allocation mechanisms, and frequent use of pointers, present challenges in certifying these applications for safety-critical systems. This paper aims to shed light on the unique characteristics of GPU programs and how they impact the certification process. To achieve this goal, several allocation methods are rigorously evaluated to determine which one is best suited to an application, regarding the program characteristics within the safety-critical domain. By conducting this evaluation, we seek to provide insights into the complexities of GPU memory accesses and its compatibility with safety-critical requirements. The ultimate objective is to offer recommendations on the most appropriate allocation method based on the unique needs of each application, thus contributing to the safe and reliable integration of GPGPUs into safety-critical systems.

Cite as

Marcos Rodriguez, Irune Yarza, Leonidas Kosmidis, and Alejandro J. Calderón. Analysis of GPU Memory Allocation Characteristics. In 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 1:1-1:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{rodriguez_et_al:OASIcs.PARMA-DITAM.2025.1,
  author =	{Rodriguez, Marcos and Yarza, Irune and Kosmidis, Leonidas and Calder\'{o}n, Alejandro J.},
  title =	{{Analysis of GPU Memory Allocation Characteristics}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{1:1--1:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.1},
  URN =		{urn:nbn:de:0030-drops-229057},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.1},
  annote =	{Keywords: CUDA, Memory allocation, Rodinia, Embedded}
}

@InProceedings{rodriguez_et_al:OASIcs.PARMA-DITAM.2025.1,
  author =	{Rodriguez, Marcos and Yarza, Irune and Kosmidis, Leonidas and Calder\'{o}n, Alejandro J.},
  title =	{{Analysis of GPU Memory Allocation Characteristics}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{1:1--1:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.1},
  URN =		{urn:nbn:de:0030-drops-229057},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.1},
  annote =	{Keywords: CUDA, Memory allocation, Rodinia, Embedded}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2019.23

Generating and Exploiting Deep Learning Variants to Increase Heterogeneous Resource Utilization in the NVIDIA Xavier

Authors: Roger Pujol, Hamid Tabani, Leonidas Kosmidis, Enrico Mezzetti, Jaume Abella, and Francisco J. Cazorla

Published in: LIPIcs, Volume 133, 31st Euromicro Conference on Real-Time Systems (ECRTS 2019)

Abstract

Deep learning-based solutions and, in particular, deep neural networks (DNNs) are at the heart of several functionalities in critical-real time embedded systems (CRTES) from vision-based perception (object detection and tracking) systems to trajectory planning. As a result, several DNN instances simultaneously run at any time on the same computing platform. However, while modern GPUs offer a variety of computing elements (e.g. CPUs, GPUs, and specific accelerators) in which those DNN tasks can be executed depending on their computational requirements and temporal constraints, current DNNs are mainly programmed to exploit one of them, namely, regular cores in the GPU. This creates resource imbalance and under-utilization of GPU resources when executing several DNN instances, causing an increase in DNN tasks' execution time requirements. In this paper, (a) we develop different variants (implementations) of well-known DNN libraries used in the Apollo Autonomous Driving (AD) software for each of the computing elements of the latest NVIDIA Xavier SoC. Each variant can be configured to balance resource requirements and performance: the regular CPU core implementation that can run on 2, 4, and 6 cores; the GPU regular and Tensor core variants that can run in 4 or 8 GPU’s Streaming Multiprocessors (SM); and 1 or 2 NVIDIA’s Deep Learning Accelerators (NVDLA); (b) we show that each particular variant/configuration offers a different resource utilization/performance point; finally, (c) we show how those heterogeneous computing elements can be exploited by a static scheduler to sustain the execution of multiple and diverse DNN variants on the same platform.

Cite as

Roger Pujol, Hamid Tabani, Leonidas Kosmidis, Enrico Mezzetti, Jaume Abella, and Francisco J. Cazorla. Generating and Exploiting Deep Learning Variants to Increase Heterogeneous Resource Utilization in the NVIDIA Xavier. In 31st Euromicro Conference on Real-Time Systems (ECRTS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 133, pp. 23:1-23:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{pujol_et_al:LIPIcs.ECRTS.2019.23,
  author =	{Pujol, Roger and Tabani, Hamid and Kosmidis, Leonidas and Mezzetti, Enrico and Abella, Jaume and Cazorla, Francisco J.},
  title =	{{Generating and Exploiting Deep Learning Variants to Increase Heterogeneous Resource Utilization in the NVIDIA Xavier}},
  booktitle =	{31st Euromicro Conference on Real-Time Systems (ECRTS 2019)},
  pages =	{23:1--23:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-110-8},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{133},
  editor =	{Quinton, Sophie},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2019.23},
  URN =		{urn:nbn:de:0030-drops-107608},
  doi =		{10.4230/LIPIcs.ECRTS.2019.23},
  annote =	{Keywords: Deep Neural Network (DNN), GPU, Heterogenous Resources}
}

Document

DOI: 10.4230/OASIcs.WCET.2016.9

Measurement-Based Timing Analysis of the AURIX Caches

Authors: Leonidas Kosmidis, Davide Compagnin, David Morales, Enrico Mezzetti, Eduardo Quinones, Jaume Abella, Tullio Vardanega, and Francisco J. Cazorla

Published in: OASIcs, Volume 55, 16th International Workshop on Worst-Case Execution Time Analysis (WCET 2016)

Abstract

Cache memories are one of the hardware resources with higher potential to reduce worst-case execution time (WCET) costs for software programs with tight real-time constraints. Yet, the complexity of cache analysis has caused a large fraction of real-time systems industry to avoid using them, especially in the automotive sector. For measurement-based timing analysis (MBTA) - the dominant technique in domains such as automotive - cache challenges the definition of test scenarios stressful enough to produce (cache) layouts that causing high contention. In this paper, we present our experience in enabling the use of caches for a real automotive application running on an AURIX multiprocessor, using software randomization and measurement-based probabilistic timing analysis (MBPTA). Our results show that software randomization successfully exposes - in the experiments performed for timing analysis - cache related variability, in a manner that can be effectively captured by MBPTA.

Cite as

Leonidas Kosmidis, Davide Compagnin, David Morales, Enrico Mezzetti, Eduardo Quinones, Jaume Abella, Tullio Vardanega, and Francisco J. Cazorla. Measurement-Based Timing Analysis of the AURIX Caches. In 16th International Workshop on Worst-Case Execution Time Analysis (WCET 2016). Open Access Series in Informatics (OASIcs), Volume 55, pp. 9:1-9:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{kosmidis_et_al:OASIcs.WCET.2016.9,
  author =	{Kosmidis, Leonidas and Compagnin, Davide and Morales, David and Mezzetti, Enrico and Quinones, Eduardo and Abella, Jaume and Vardanega, Tullio and Cazorla, Francisco J.},
  title =	{{Measurement-Based Timing Analysis of the AURIX Caches}},
  booktitle =	{16th International Workshop on Worst-Case Execution Time Analysis (WCET 2016)},
  pages =	{9:1--9:11},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-025-5},
  ISSN =	{2190-6807},
  year =	{2016},
  volume =	{55},
  editor =	{Schoeberl, Martin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.WCET.2016.9},
  URN =		{urn:nbn:de:0030-drops-69028},
  doi =		{10.4230/OASIcs.WCET.2016.9},
  annote =	{Keywords: WCET, caches, AURIX, Automotive}
}

18 Search Results for "Kosmidis, Leonidas"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message