Search Results

Documents authored by Kosmidis, Leonidas


Document
Complete Volume
OASIcs, Volume 127, PARMA-DITAM 2025, Complete Volume

Authors: Daniele Cattaneo, Maria Fazio, Leonidas Kosmidis, and Gabriele Morabito

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)


Abstract
OASIcs, Volume 127, PARMA-DITAM 2025, Complete Volume

Cite as

16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 1-100, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@Proceedings{cattaneo_et_al:OASIcs.PARMA-DITAM.2025,
  title =	{{OASIcs, Volume 127, PARMA-DITAM 2025, Complete Volume}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{1--100},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025},
  URN =		{urn:nbn:de:0030-drops-229378},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025},
  annote =	{Keywords: OASIcs, Volume 127, PARMA-DITAM 2025, Complete Volume}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Daniele Cattaneo, Maria Fazio, Leonidas Kosmidis, and Gabriele Morabito

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)


Abstract
Front Matter, Table of Contents, Preface, Conference Organization

Cite as

16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 0:i-0:x, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{cattaneo_et_al:OASIcs.PARMA-DITAM.2025.0,
  author =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{0:i--0:x},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.0},
  URN =		{urn:nbn:de:0030-drops-229363},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
Document
Analysis of GPU Memory Allocation Characteristics

Authors: Marcos Rodriguez, Irune Yarza, Leonidas Kosmidis, and Alejandro J. Calderón

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)


Abstract
The number of applications subject to safety-critical regulations is on the rise, and consequently, the computing requirements for such applications are increasing as well. This trend has led to the integration of General-Purpose Graphics Processing Units (GPGPUs) into these systems. However, the inherent characteristics of GPGPUs, including their black-box nature, dynamic allocation mechanisms, and frequent use of pointers, present challenges in certifying these applications for safety-critical systems. This paper aims to shed light on the unique characteristics of GPU programs and how they impact the certification process. To achieve this goal, several allocation methods are rigorously evaluated to determine which one is best suited to an application, regarding the program characteristics within the safety-critical domain. By conducting this evaluation, we seek to provide insights into the complexities of GPU memory accesses and its compatibility with safety-critical requirements. The ultimate objective is to offer recommendations on the most appropriate allocation method based on the unique needs of each application, thus contributing to the safe and reliable integration of GPGPUs into safety-critical systems.

Cite as

Marcos Rodriguez, Irune Yarza, Leonidas Kosmidis, and Alejandro J. Calderón. Analysis of GPU Memory Allocation Characteristics. In 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 1:1-1:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{rodriguez_et_al:OASIcs.PARMA-DITAM.2025.1,
  author =	{Rodriguez, Marcos and Yarza, Irune and Kosmidis, Leonidas and Calder\'{o}n, Alejandro J.},
  title =	{{Analysis of GPU Memory Allocation Characteristics}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{1:1--1:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.1},
  URN =		{urn:nbn:de:0030-drops-229057},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.1},
  annote =	{Keywords: CUDA, Memory allocation, Rodinia, Embedded}
}
Document
Towards Studying the Effect of Compiler Optimizations and Software Randomization on GPU Reliability

Authors: Pau López Castillón, Xavier Caricchio Hernández, and Leonidas Kosmidis

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)


Abstract
The evolution of Graphics Processing Unit (GPU) compilers has facilitated the support for general-purpose programming languages across various architectures. The NVIDIA CUDA Compiler (NVCC) employs multiple compilation levels prior to generating machine code, implementing intricate optimizations to enhance performance. These optimizations influence the manner in which software is mapped to the underlying hardware, which can also impact GPU reliability. TASA is a source-to-source code randomization tool designed to alter the mapping of software onto the underlying hardware. It achieves this by generating random permutations of variable and function declarations, thereby introducing random padding between declarations of different types and modifying the program memory layout. Since this modifies their location in the memory, it also modifies their cache placement, affecting both their execution time (due to the different conflicts between them, which result in a different amount of cache misses in every execution), as well as their lifetime in the cache. In this work, which is part of the HiPEAC Student Challenge 2025, we first examine the reproducibility of a subset of data presented in the ACM TACO paper "Assessing the Impact of Compiler Optimizations on GPU Reliability" [Santos et al., 2024], and second we extend it by combining it with our proposal of software randomization. The paper indicates that the -O3 optimization flag facilitates an increased workload before failures occur within the application. By employing TASA, we investigate the impact of GPU randomization on reliability and performance metrics. By reproducing the results of the paper on a different GPU platform, we observe the same trend as reported in the original publication. Moreover, our preliminary results with the application of software randomization show in several cases an improved Mean Waiting Before Failure (MWBF) compared to the original source code.

Cite as

Pau López Castillón, Xavier Caricchio Hernández, and Leonidas Kosmidis. Towards Studying the Effect of Compiler Optimizations and Software Randomization on GPU Reliability. In 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 4:1-4:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{castillon_et_al:OASIcs.PARMA-DITAM.2025.4,
  author =	{Castill\'{o}n, Pau L\'{o}pez and Hern\'{a}ndez, Xavier Caricchio and Kosmidis, Leonidas},
  title =	{{Towards Studying the Effect of Compiler Optimizations and Software Randomization on GPU Reliability}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{4:1--4:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.4},
  URN =		{urn:nbn:de:0030-drops-229083},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.4},
  annote =	{Keywords: Graphics processing units, reliability, software randomization, error rate}
}
Document
Evaluation of the Parallel Features of Rust for Space Systems

Authors: Alberto Perugini and Leonidas Kosmidis

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)


Abstract
The rise in complexity of the algorithms run on space systems, largely attributable to higher resolution instruments which generate a large amount of the data to be processed, as well as to the need for increased autonomy, which relies on Neural Network inference systems in future missions, demand the adoption of more powerful on-board hardware, such as multicores. At the same time, the correctness and reliability of critical on-board software is of paramount importance for the success of space missions. However, developing such complex software in low-level languages can have a negative impact on these aspects. For this reason, this paper evaluates the role that the Rust programming language can have in this change, given its memory safety and built in support for parallelism, which allows to better utilise more powerful hardware, in particular multicore cpus, without compromising the programmability and safety of the code. To this end, the GPU4S benchmarking suite, part of the open source OBPMark benchmarking suite of the European Space Agency (ESA), is ported to Rust, with sequential and parallel implementations. The performance of the ported benchmarks is compared to the existing sequential and parallel implementations in low-level languages to evaluate the trade-offs of the different solutions, and it is evaluated on several multicore platforms which are candidates for future on-board processing systems. A particular focus is put on parallel versions of the benchmarks, where Rust offers solid native support, as well as library support for fast parallelization similar to OpenMP. Finally, in terms of correctness, the Rust implementations are free of recently detected defects in the low-level implementations of the GPU4S benchmarks.

Cite as

Alberto Perugini and Leonidas Kosmidis. Evaluation of the Parallel Features of Rust for Space Systems. In 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 5:1-5:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{perugini_et_al:OASIcs.PARMA-DITAM.2025.5,
  author =	{Perugini, Alberto and Kosmidis, Leonidas},
  title =	{{Evaluation of the Parallel Features of Rust for Space Systems}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{5:1--5:20},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.5},
  URN =		{urn:nbn:de:0030-drops-229098},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.5},
  annote =	{Keywords: Rust, Multicore, Space Systems}
}
Document
Generating and Exploiting Deep Learning Variants to Increase Heterogeneous Resource Utilization in the NVIDIA Xavier

Authors: Roger Pujol, Hamid Tabani, Leonidas Kosmidis, Enrico Mezzetti, Jaume Abella, and Francisco J. Cazorla

Published in: LIPIcs, Volume 133, 31st Euromicro Conference on Real-Time Systems (ECRTS 2019)


Abstract
Deep learning-based solutions and, in particular, deep neural networks (DNNs) are at the heart of several functionalities in critical-real time embedded systems (CRTES) from vision-based perception (object detection and tracking) systems to trajectory planning. As a result, several DNN instances simultaneously run at any time on the same computing platform. However, while modern GPUs offer a variety of computing elements (e.g. CPUs, GPUs, and specific accelerators) in which those DNN tasks can be executed depending on their computational requirements and temporal constraints, current DNNs are mainly programmed to exploit one of them, namely, regular cores in the GPU. This creates resource imbalance and under-utilization of GPU resources when executing several DNN instances, causing an increase in DNN tasks' execution time requirements. In this paper, (a) we develop different variants (implementations) of well-known DNN libraries used in the Apollo Autonomous Driving (AD) software for each of the computing elements of the latest NVIDIA Xavier SoC. Each variant can be configured to balance resource requirements and performance: the regular CPU core implementation that can run on 2, 4, and 6 cores; the GPU regular and Tensor core variants that can run in 4 or 8 GPU’s Streaming Multiprocessors (SM); and 1 or 2 NVIDIA’s Deep Learning Accelerators (NVDLA); (b) we show that each particular variant/configuration offers a different resource utilization/performance point; finally, (c) we show how those heterogeneous computing elements can be exploited by a static scheduler to sustain the execution of multiple and diverse DNN variants on the same platform.

Cite as

Roger Pujol, Hamid Tabani, Leonidas Kosmidis, Enrico Mezzetti, Jaume Abella, and Francisco J. Cazorla. Generating and Exploiting Deep Learning Variants to Increase Heterogeneous Resource Utilization in the NVIDIA Xavier. In 31st Euromicro Conference on Real-Time Systems (ECRTS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 133, pp. 23:1-23:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{pujol_et_al:LIPIcs.ECRTS.2019.23,
  author =	{Pujol, Roger and Tabani, Hamid and Kosmidis, Leonidas and Mezzetti, Enrico and Abella, Jaume and Cazorla, Francisco J.},
  title =	{{Generating and Exploiting Deep Learning Variants to Increase Heterogeneous Resource Utilization in the NVIDIA Xavier}},
  booktitle =	{31st Euromicro Conference on Real-Time Systems (ECRTS 2019)},
  pages =	{23:1--23:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-110-8},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{133},
  editor =	{Quinton, Sophie},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2019.23},
  URN =		{urn:nbn:de:0030-drops-107608},
  doi =		{10.4230/LIPIcs.ECRTS.2019.23},
  annote =	{Keywords: Deep Neural Network (DNN), GPU, Heterogenous Resources}
}
Document
Measurement-Based Timing Analysis of the AURIX Caches

Authors: Leonidas Kosmidis, Davide Compagnin, David Morales, Enrico Mezzetti, Eduardo Quinones, Jaume Abella, Tullio Vardanega, and Francisco J. Cazorla

Published in: OASIcs, Volume 55, 16th International Workshop on Worst-Case Execution Time Analysis (WCET 2016)


Abstract
Cache memories are one of the hardware resources with higher potential to reduce worst-case execution time (WCET) costs for software programs with tight real-time constraints. Yet, the complexity of cache analysis has caused a large fraction of real-time systems industry to avoid using them, especially in the automotive sector. For measurement-based timing analysis (MBTA) - the dominant technique in domains such as automotive - cache challenges the definition of test scenarios stressful enough to produce (cache) layouts that causing high contention. In this paper, we present our experience in enabling the use of caches for a real automotive application running on an AURIX multiprocessor, using software randomization and measurement-based probabilistic timing analysis (MBPTA). Our results show that software randomization successfully exposes - in the experiments performed for timing analysis - cache related variability, in a manner that can be effectively captured by MBPTA.

Cite as

Leonidas Kosmidis, Davide Compagnin, David Morales, Enrico Mezzetti, Eduardo Quinones, Jaume Abella, Tullio Vardanega, and Francisco J. Cazorla. Measurement-Based Timing Analysis of the AURIX Caches. In 16th International Workshop on Worst-Case Execution Time Analysis (WCET 2016). Open Access Series in Informatics (OASIcs), Volume 55, pp. 9:1-9:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{kosmidis_et_al:OASIcs.WCET.2016.9,
  author =	{Kosmidis, Leonidas and Compagnin, Davide and Morales, David and Mezzetti, Enrico and Quinones, Eduardo and Abella, Jaume and Vardanega, Tullio and Cazorla, Francisco J.},
  title =	{{Measurement-Based Timing Analysis of the AURIX Caches}},
  booktitle =	{16th International Workshop on Worst-Case Execution Time Analysis (WCET 2016)},
  pages =	{9:1--9:11},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-025-5},
  ISSN =	{2190-6807},
  year =	{2016},
  volume =	{55},
  editor =	{Schoeberl, Martin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.WCET.2016.9},
  URN =		{urn:nbn:de:0030-drops-69028},
  doi =		{10.4230/OASIcs.WCET.2016.9},
  annote =	{Keywords: WCET, caches, AURIX, Automotive}
}
Document
Applying Measurement-Based Probabilistic Timing Analysis to Buffer Resources

Authors: Leonidas Kosmidis, Tullio Vardanega, Jaume Abella, Eduardo Quiñones, and Francisco J. Cazorla

Published in: OASIcs, Volume 30, 13th International Workshop on Worst-Case Execution Time Analysis (2013)


Abstract
The use of complex hardware makes it difficult for current timing analysis techniques to compute trustworthy and tight worst-case execution time (WCET) bounds. Those techniques require detailed knowledge of the internal operation and state of the platform, at both the software and hardware level. Obtaining that information for modern hardware platforms is increasingly difficult. Measurement-Based Probabilistic Timing Analysis (MBPTA) reduces the cost of acquiring the knowledge needed for computing trustworthy and tight WCET bounds. MBPTA based on Extreme Value Theory requires the execution time of processor instructions to be independent and identically distributed (i.i.d.), which can be achieved with some hardware support. Previous proposals show how those properties can be achieved for caches. This paper considers, for the first time, the implications on MBPTA of using buffer resources. Buffers in general, and first-come first-served (FCFS) buffers in particular, are of paramount importance as the complexity of hardware increases, since they allow managing contention in those resources where multiple requests may be pending. We show how buffers can be used in the context of MBPTA and provide illustrative examples.

Cite as

Leonidas Kosmidis, Tullio Vardanega, Jaume Abella, Eduardo Quiñones, and Francisco J. Cazorla. Applying Measurement-Based Probabilistic Timing Analysis to Buffer Resources. In 13th International Workshop on Worst-Case Execution Time Analysis. Open Access Series in Informatics (OASIcs), Volume 30, pp. 97-108, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2013)


Copy BibTex To Clipboard

@InProceedings{kosmidis_et_al:OASIcs.WCET.2013.97,
  author =	{Kosmidis, Leonidas and Vardanega, Tullio and Abella, Jaume and Qui\~{n}ones, Eduardo and Cazorla, Francisco J.},
  title =	{{Applying Measurement-Based Probabilistic Timing Analysis to Buffer Resources}},
  booktitle =	{13th International Workshop on Worst-Case Execution Time Analysis},
  pages =	{97--108},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-54-5},
  ISSN =	{2190-6807},
  year =	{2013},
  volume =	{30},
  editor =	{Maiza, Claire},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.WCET.2013.97},
  URN =		{urn:nbn:de:0030-drops-41269},
  doi =		{10.4230/OASIcs.WCET.2013.97},
  annote =	{Keywords: WCET, Buffer, Probabilistic Timing Analysis}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail