DROPS

Document

Invited Paper

DOI: 10.4230/OASIcs.WCET.2024.5

Invited Paper: On the Granularity of Bandwidth Regulation in FPGA-Based Heterogeneous Systems on Chip

Authors: Gianluca Brilli, Giacomo Valente, Alessandro Capotondi, Tania Di Mascio, and Andrea Marongiu

Published in: OASIcs, Volume 121, 22nd International Workshop on Worst-Case Execution Time Analysis (WCET 2024)

Abstract

Main memory sharing in commercial, FPGA-based Heterogeneous System on Chips (HeSoCs) can cause significant interference, and ultimately severe slowdown of the executing workload, which bars the adoption of such systems in the context of time-critical applications. Bandwidth regulation approaches based on monitoring and throttling are widely adopted also in commercial hardware to improve the system quality of service (QoS), and previous work has shown that the finer the granularity of the mechanism, the more effective the QoS control. Different mechanisms, however, might exploit more or less effectively the available residual memory bandwidth, provided that the QoS requirement is satisfied. In this paper we present an exhaustive experimental evaluation of how three bandwidth regulation mechanisms with coarse, fine and ultra-fine granularity compare in terms of exploitation of the system memory bandwidth. Our results show that a very fine-grained regulation mechanism might experience worse system-level memory bandwidth exploitation compared to a coarser-grained approach.

Cite as

Gianluca Brilli, Giacomo Valente, Alessandro Capotondi, Tania Di Mascio, and Andrea Marongiu. Invited Paper: On the Granularity of Bandwidth Regulation in FPGA-Based Heterogeneous Systems on Chip. In 22nd International Workshop on Worst-Case Execution Time Analysis (WCET 2024). Open Access Series in Informatics (OASIcs), Volume 121, pp. 5:1-5:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{brilli_et_al:OASIcs.WCET.2024.5,
  author =	{Brilli, Gianluca and Valente, Giacomo and Capotondi, Alessandro and Di Mascio, Tania and Marongiu, Andrea},
  title =	{{Invited Paper: On the Granularity of Bandwidth Regulation in FPGA-Based Heterogeneous Systems on Chip}},
  booktitle =	{22nd International Workshop on Worst-Case Execution Time Analysis (WCET 2024)},
  pages =	{5:1--5:11},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-346-1},
  ISSN =	{2190-6807},
  year =	{2024},
  volume =	{121},
  editor =	{Carle, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.WCET.2024.5},
  URN =		{urn:nbn:de:0030-drops-204732},
  doi =		{10.4230/OASIcs.WCET.2024.5},
  annote =	{Keywords: Bandwidth Regulation, System-on-Chip, FPGA}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2024.1

JuMP2start: Time-Aware Stop-Start Technology for a Software-Defined Vehicle System

Authors: Anam Farrukh and Richard West

Published in: LIPIcs, Volume 298, 36th Euromicro Conference on Real-Time Systems (ECRTS 2024)

Abstract

Software-defined vehicle (SDV) systems replace traditional ECU architectures with software tasks running on centralized multicore processors in automotive-grade PCs. However, PC boot delays to cold-start an integrated vehicle management system (VMS) are problematic for time-critical functions, which must process sensor and actuator data within specific time bounds. To tackle this challenge, we present JuMP2start: a time-aware multicore stop-start approach for SDVs. JuMP2start leverages PC-class suspend-to-RAM techniques to capture a system snapshot when the vehicle is stopped. Upon restart, critical services are resumed-from-RAM within order of milliseconds compared to normal cold-start times. This work showcases how JuMP2start manages global suspension and resumption mechanisms for a state-of-the-art dual-domain vehicle management system comprising real-time OS (RTOS) and Linux SMP guests. JuMP2start models automotive tasks as continuable or restartable to ensure timing- and safety-critical function pipelines are reactively resumed with low latency, while discarding stale task state. Experiments with the VMS show that critical CAN traffic processing resumes within 500 milliseconds of waking the RTOS guest, and reaches steady-state throughput in under 7ms.

Cite as

Anam Farrukh and Richard West. JuMP2start: Time-Aware Stop-Start Technology for a Software-Defined Vehicle System. In 36th Euromicro Conference on Real-Time Systems (ECRTS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 298, pp. 1:1-1:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{farrukh_et_al:LIPIcs.ECRTS.2024.1,
  author =	{Farrukh, Anam and West, Richard},
  title =	{{JuMP2start: Time-Aware Stop-Start Technology for a Software-Defined Vehicle System}},
  booktitle =	{36th Euromicro Conference on Real-Time Systems (ECRTS 2024)},
  pages =	{1:1--1:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-324-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{298},
  editor =	{Pellizzoni, Rodolfo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2024.1},
  URN =		{urn:nbn:de:0030-drops-203046},
  doi =		{10.4230/LIPIcs.ECRTS.2024.1},
  annote =	{Keywords: Time-aware stop-start, Real-time power management, Suspend-to-RAM, Partitioning hypervisor, Vehicle management system, Vehicle-OS, Software-defined vehicles (SDV)}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2024.5

Shared Resource Contention in MCUs: A Reality Check and the Quest for Timeliness

Authors: Daniel Oliveira, Weifan Chen, Sandro Pinto, and Renato Mancuso

Published in: LIPIcs, Volume 298, 36th Euromicro Conference on Real-Time Systems (ECRTS 2024)

Abstract

Microcontrollers (MCUs) are steadily embracing multi-core technology to meet growing performance demands. This trend marks a shift from their traditionally simple, deterministic designs to more complex and inherently less predictable architectures. While shared resource contention is well-studied in mid to high-end embedded systems, the emergence of multi-core architectures in MCUs introduces unique challenges and characteristics that existing research has not fully explored. In this paper, we conduct an in-depth investigation of both mainstream and next-generation MCU-based platforms, aiming to identify the sources of contention on systems typically lacking these problems. We empirically demonstrate substantial contention effects across different MCU architectures (i.e., from single- to multi-core configurations), highlighting significant application slowdowns. Notably, we observe that slowdowns can reach several orders of magnitude, with the most extreme cases showing up to a 3800x (times, not percent) increase in execution time. To address these issues, we propose and evaluate muTPArtc, a novel mechanism designed for Timely Progress Assessment (TPA) and TPA-based runtime control specifically tailored to MCUs. muTPArtc is an MCU-specialized TPA-based mechanism that leverages hardware facilities widely available in commercial off-the-shelf MCUs (i.e., hardware breakpoints and cycle counters) to successfully monitor applications' progress, detect, and mitigate timing violations. Our results demonstrate that muTPArtc effectively manages performance degradation due to interference, requiring only minimal modifications to the build pipeline and no changes to the source code of the target application, while incurring minor overheads.

Cite as

Daniel Oliveira, Weifan Chen, Sandro Pinto, and Renato Mancuso. Shared Resource Contention in MCUs: A Reality Check and the Quest for Timeliness. In 36th Euromicro Conference on Real-Time Systems (ECRTS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 298, pp. 5:1-5:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{oliveira_et_al:LIPIcs.ECRTS.2024.5,
  author =	{Oliveira, Daniel and Chen, Weifan and Pinto, Sandro and Mancuso, Renato},
  title =	{{Shared Resource Contention in MCUs: A Reality Check and the Quest for Timeliness}},
  booktitle =	{36th Euromicro Conference on Real-Time Systems (ECRTS 2024)},
  pages =	{5:1--5:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-324-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{298},
  editor =	{Pellizzoni, Rodolfo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2024.5},
  URN =		{urn:nbn:de:0030-drops-203088},
  doi =		{10.4230/LIPIcs.ECRTS.2024.5},
  annote =	{Keywords: multi-core microcontrollers, shared resources contention, progress-aware regulation}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2024.7

The Omnivisor: A Real-Time Static Partitioning Hypervisor Extension for Heterogeneous Core Virtualization over MPSoCs

Authors: Daniele Ottaviano, Francesco Ciraolo, Renato Mancuso, and Marcello Cinque

Published in: LIPIcs, Volume 298, 36th Euromicro Conference on Real-Time Systems (ECRTS 2024)

Abstract

Following the needs of industrial applications, virtualization has emerged as one of the most effective approaches for the consolidation of mixed-criticality systems while meeting tight constraints in terms of space, weight, power, and cost (SWaP-C). In embedded platforms with homogeneous processors, a wealth of works have proposed designs and techniques to enforce spatio-temporal isolation by leveraging well-understood virtualization support. Unfortunately, achieving the same goal on heterogeneous MultiProcessor Systems-on-Chip (MPSoCs) has been largely overlooked. Modern hypervisors are designed to operate exclusively on main cores, with little or no consideration given to other co-processors within the system, such as small microcontroller-level CPUs or soft-cores deployed on programmable logic (FPGA). Typically, hypervisors consider co-processors as I/O devices allocated to virtual machines that run on primary cores, yielding full control and responsibility over them. Nevertheless, inadequate management of these resources can lead to spatio-temporal isolation issues within the system. In this paper, we propose the Omnivisor model as a paradigm for the holistic management of heterogeneous platforms. The model generalizes the features of real-time static partitioning hypervisors to enable the execution of virtual machines on processors with different Instruction Set Architectures (ISAs) within the same MPSoC. Moreover, the Omnivisor ensures temporal and spatial isolation between virtual machines by integrating and leveraging a variety of hardware and software protection mechanisms. The presented approach not only expands the scope of virtualization in MPSoCs but also enhances the overall system reliability and real-time performance for mixed-criticality applications. A full open-source reference implementation of the Omnivisor based on the Jailhouse hypervisor is provided, targeting ARM real-time processing units and RISC-V soft-cores on FPGA. Experimental results on real hardware show the benefits of the solution, including enabling the seamless launch of virtual machines on different ISAs and extending spatial/temporal isolation to heterogenous cores with enhanced regulation policies.

Cite as

Daniele Ottaviano, Francesco Ciraolo, Renato Mancuso, and Marcello Cinque. The Omnivisor: A Real-Time Static Partitioning Hypervisor Extension for Heterogeneous Core Virtualization over MPSoCs. In 36th Euromicro Conference on Real-Time Systems (ECRTS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 298, pp. 7:1-7:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{ottaviano_et_al:LIPIcs.ECRTS.2024.7,
  author =	{Ottaviano, Daniele and Ciraolo, Francesco and Mancuso, Renato and Cinque, Marcello},
  title =	{{The Omnivisor: A Real-Time Static Partitioning Hypervisor Extension for Heterogeneous Core Virtualization over MPSoCs}},
  booktitle =	{36th Euromicro Conference on Real-Time Systems (ECRTS 2024)},
  pages =	{7:1--7:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-324-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{298},
  editor =	{Pellizzoni, Rodolfo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2024.7},
  URN =		{urn:nbn:de:0030-drops-203107},
  doi =		{10.4230/LIPIcs.ECRTS.2024.7},
  annote =	{Keywords: Mixed-Criticality, Embedded Virtualization, Real-Time Systems, MPSoCs}
}

Document

Artifact

DOI: 10.4230/DARTS.10.1.4

The Omnivisor: A real-time static partitioning hypervisor extension for heterogeneous core virtualization over MPSoCs (Artifact)

Authors: Daniele Ottaviano, Francesco Ciraolo, Renato Mancuso, and Marcello Cinque

Published in: DARTS, Volume 10, Issue 1, Special Issue of the 36th Euromicro Conference on Real-Time Systems (ECRTS 2024)

Abstract

Following the needs of industrial applications, virtualization has emerged as one of the most effective approaches for the consolidation of mixed-criticality systems while meeting tight constraints in terms of space, weight, power, and cost (SWaP-C). In embedded platforms with homogeneous processors, a wealth of works have proposed designs and techniques to enforce spatio-temporal isolation by leveraging well-understood virtualization support. Unfortunately, achieving the same goal on heterogeneous MultiProcessor Systems-on-Chip (MPSoCs) has been largely overlooked. Modern hypervisors are designed to operate exclusively on main cores, with little or no consideration given to other co-processors within the system, such as small microcontroller-level CPUs or soft-cores deployed on programmable logic (FPGA). Typically, hypervisors consider co-processors as I/O devices allocated to virtual machines that run on primary cores, yielding full control and responsibility over them. Nevertheless, inadequate management of these resources can lead to spatio-temporal isolation issues within the system. In this paper, we propose the Omnivisor model as a paradigm for the holistic management of heterogeneous platforms. The model generalizes the features of real-time static partitioning hypervisors to enable the execution of virtual machines on processors with different Instruction Set Architectures (ISAs) within the same MPSoC. Moreover, the Omnivisor ensures temporal and spatial isolation between virtual machines by integrating and leveraging a variety of hardware and software protection mechanisms. The presented approach not only expands the scope of virtualization in MPSoCs but also enhances the overall system reliability and real-time performance for mixed-criticality applications. A full open-source reference implementation of the Omnivisor based on the Jailhouse hypervisor is provided, targeting ARM real-time processing units and RISC-V soft-cores on FPGA. Experimental results on real hardware show the benefits of the solution, in terms of the seamless launch of virtual machines on different ISAs, and spatial/temporal isolation, enhanced with regulation policies.

Cite as

Daniele Ottaviano, Francesco Ciraolo, Renato Mancuso, and Marcello Cinque. The Omnivisor: A real-time static partitioning hypervisor extension for heterogeneous core virtualization over MPSoCs (Artifact). In Special Issue of the 36th Euromicro Conference on Real-Time Systems (ECRTS 2024). Dagstuhl Artifacts Series (DARTS), Volume 10, Issue 1, pp. 4:1-4:7, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@Article{ottaviano_et_al:DARTS.10.1.4,
  author =	{Ottaviano, Daniele and Ciraolo, Francesco and Mancuso, Renato and Cinque, Marcello},
  title =	{{The Omnivisor: A real-time static partitioning hypervisor extension for heterogeneous core virtualization over MPSoCs (Artifact)}},
  pages =	{4:1--4:7},
  journal =	{Dagstuhl Artifacts Series},
  ISBN =	{978-3-95977-327-0},
  ISSN =	{2509-8195},
  year =	{2024},
  volume =	{10},
  number =	{1},
  editor =	{Ottaviano, Daniele and Ciraolo, Francesco and Mancuso, Renato and Cinque, Marcello},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DARTS.10.1.4},
  URN =		{urn:nbn:de:0030-drops-203267},
  doi =		{10.4230/DARTS.10.1.4},
  annote =	{Keywords: Mixed-Criticality, Embedded Virtualization, Real-Time Systems, MPSoCs.}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2023.4

Memory Latency Distribution-Driven Regulation for Temporal Isolation in MPSoCs

Authors: Ahsan Saeed, Denis Hoornaert, Dakshina Dasari, Dirk Ziegenbein, Daniel Mueller-Gritschneder, Ulf Schlichtmann, Andreas Gerstlauer, and Renato Mancuso

Published in: LIPIcs, Volume 262, 35th Euromicro Conference on Real-Time Systems (ECRTS 2023)

Abstract

Temporal isolation is one of the most significant challenges that must be addressed before Multi-Processor Systems-on-Chip (MPSoCs) can be widely adopted in mixed-criticality systems with both time-sensitive real-time (RT) applications and performance-oriented non-real-time (NRT) applications. Specifically, the main memory subsystem is one of the most prevalent causes of interference, performance degradation and loss of isolation. Existing memory bandwidth regulation mechanisms use static, dynamic, or predictive DRAM bandwidth management techniques to restore the execution time of an application under contention as close as possible to the execution time in isolation. In this paper, we propose a novel distribution-driven regulation whose goal is to achieve a timeliness objective formulated as a constraint on the probability of meeting a certain target execution time for the RT applications. Using existing interconnect-level Performance Monitoring Units (PMU), we can observe the Cumulative Distribution Function (CDF) of the per-request memory latency. Regulation is then triggered to enforce first-order stochastical dominance with respect to a desired reference. Consequently, it is possible to enforce that the overall observed execution time random variable is dominated by the reference execution time. The mechanism requires no prior information of the contending application and treats the DRAM subsystem as a black box. We provide a full-stack implementation of our mechanism on a Commercial Off-The-Shelf (COTS) platform (Xilinx Ultrascale+ MPSoC), evaluate it using real and synthetic benchmarks, experimentally validate that the timeliness objectives are met for the RT applications, and demonstrate that it is able to provide 2.2x more overall throughput for NRT applications compared to DRAM bandwidth management-based regulation approaches.

Cite as

Ahsan Saeed, Denis Hoornaert, Dakshina Dasari, Dirk Ziegenbein, Daniel Mueller-Gritschneder, Ulf Schlichtmann, Andreas Gerstlauer, and Renato Mancuso. Memory Latency Distribution-Driven Regulation for Temporal Isolation in MPSoCs. In 35th Euromicro Conference on Real-Time Systems (ECRTS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 262, pp. 4:1-4:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{saeed_et_al:LIPIcs.ECRTS.2023.4,
  author =	{Saeed, Ahsan and Hoornaert, Denis and Dasari, Dakshina and Ziegenbein, Dirk and Mueller-Gritschneder, Daniel and Schlichtmann, Ulf and Gerstlauer, Andreas and Mancuso, Renato},
  title =	{{Memory Latency Distribution-Driven Regulation for Temporal Isolation in MPSoCs}},
  booktitle =	{35th Euromicro Conference on Real-Time Systems (ECRTS 2023)},
  pages =	{4:1--4:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-280-8},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{262},
  editor =	{Papadopoulos, Alessandro V.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2023.4},
  URN =		{urn:nbn:de:0030-drops-180339},
  doi =		{10.4230/LIPIcs.ECRTS.2023.4},
  annote =	{Keywords: temporal isolation, memory latency, real-time system, multi-core}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2023.13

Low-Overhead Online Assessment of Timely Progress as a System Commodity

Authors: Weifan Chen, Ivan Izhbirdeev, Denis Hoornaert, Shahin Roozkhosh, Patrick Carpanedo, Sanskriti Sharma, and Renato Mancuso

Published in: LIPIcs, Volume 262, 35th Euromicro Conference on Real-Time Systems (ECRTS 2023)

Abstract

The correctness of safety-critical systems depends on both their logical and temporal behavior. Control-flow integrity (CFI) is a well-established and understood technique to safeguard the logical flow of safety-critical applications. But unfortunately, no established methodologies exist for the complementary problem of detecting violations of control flow timeliness. Worse yet, the latter dimension, which we term Timely Progress Integrity (TPI), is increasingly more jeopardized as the complexity of our embedded systems continues to soar. As key resources of the memory hierarchy become shared by several CPUs and accelerators, they become hard-to-analyze performance bottlenecks. And the precise interplay between software and hardware components becomes hard to predict and reason about. How to restore control over timely progress integrity? We postulate that the first stepping stone toward TPI is to develop methodologies for Timely Progress Assessment (TPA). TPA refers to the ability of a system to live-monitor the positive/negative slack - with respect to a known reference - at key milestones throughout an application’s lifespan. In this paper, we propose one such methodology that goes under the name of Milestone-Based Timely Progress Assessment or MB-TPA, for short. Among the key design principles of MB-TPA is the ability to operate on black-box binary executables with near-zero time overhead and implementable on commercial platforms. To prove its feasibility and effectiveness, we propose and evaluate a full-stack implementation called Timely Progress Assessment with 0 Overhead (TPAw0v). We demonstrate its capability in providing live TPA for complex vision applications while introducing less than 0.6% time overhead for applications under test. Finally, we demonstrate one use case where TPA information is used to restore TPI in the presence of temporal interference over shared memory resources.

Cite as

Weifan Chen, Ivan Izhbirdeev, Denis Hoornaert, Shahin Roozkhosh, Patrick Carpanedo, Sanskriti Sharma, and Renato Mancuso. Low-Overhead Online Assessment of Timely Progress as a System Commodity. In 35th Euromicro Conference on Real-Time Systems (ECRTS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 262, pp. 13:1-13:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.ECRTS.2023.13,
  author =	{Chen, Weifan and Izhbirdeev, Ivan and Hoornaert, Denis and Roozkhosh, Shahin and Carpanedo, Patrick and Sharma, Sanskriti and Mancuso, Renato},
  title =	{{Low-Overhead Online Assessment of Timely Progress as a System Commodity}},
  booktitle =	{35th Euromicro Conference on Real-Time Systems (ECRTS 2023)},
  pages =	{13:1--13:26},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-280-8},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{262},
  editor =	{Papadopoulos, Alessandro V.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2023.13},
  URN =		{urn:nbn:de:0030-drops-180428},
  doi =		{10.4230/LIPIcs.ECRTS.2023.13},
  annote =	{Keywords: progress-aware regulation, hardware assisted runtime monitoring, timing annotation, control flow graph}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2021.2

A Memory Scheduling Infrastructure for Multi-Core Systems with Re-Programmable Logic

Authors: Denis Hoornaert, Shahin Roozkhosh, and Renato Mancuso

Published in: LIPIcs, Volume 196, 33rd Euromicro Conference on Real-Time Systems (ECRTS 2021)

Abstract

The sharp increase in demand for performance has prompted an explosion in the complexity of modern multi-core embedded systems. This has lead to unprecedented temporal unpredictability concerns in Cyber-Physical Systems (CPS). On-chip integration of programmable logic (PL) alongside a conventional Processing System (PS) in modern Systems-on-Chip (SoC) establishes a genuine compromise between specialization, performance, and reconfigurability. In addition to typical use-cases, it has been shown that the PL can be used to observe, manipulate, and ultimately manage memory traffic generated by a traditional multi-core processor. This paper explores the possibility of PL-aided memory scheduling by proposing a Scheduler In-the-Middle (SchIM). We demonstrate that the SchIM enables transaction-level control over the main memory traffic generated by a set of embedded cores. Focusing on extensibility and reconfigurability, we put forward a SchIM design covering two main objectives. First, to provide a safe playground to test innovative memory scheduling mechanisms; and second, to establish a transition path from software-based memory regulation to provably correct hardware-enforced memory scheduling. We evaluate our design through a full-system implementation on a commercial PS-PL platform using synthetic and real-world benchmarks.

Cite as

Denis Hoornaert, Shahin Roozkhosh, and Renato Mancuso. A Memory Scheduling Infrastructure for Multi-Core Systems with Re-Programmable Logic. In 33rd Euromicro Conference on Real-Time Systems (ECRTS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 196, pp. 2:1-2:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{hoornaert_et_al:LIPIcs.ECRTS.2021.2,
  author =	{Hoornaert, Denis and Roozkhosh, Shahin and Mancuso, Renato},
  title =	{{A Memory Scheduling Infrastructure for Multi-Core Systems with Re-Programmable Logic}},
  booktitle =	{33rd Euromicro Conference on Real-Time Systems (ECRTS 2021)},
  pages =	{2:1--2:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-192-4},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{196},
  editor =	{Brandenburg, Bj\"{o}rn B.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2021.2},
  URN =		{urn:nbn:de:0030-drops-139331},
  doi =		{10.4230/LIPIcs.ECRTS.2021.2},
  annote =	{Keywords: Memory Scheduling, PLIM, FPGA, Memory Management, Bandwidth Regulation, MemGuard, Coloring, Bank Partitioning, Real-time, Multicore, Safety-critical}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2021.4

Governing with Insights: Towards Profile-Driven Cache Management of Black-Box Applications

Authors: Golsana Ghaemi, Dharmesh Tarapore, and Renato Mancuso

Published in: LIPIcs, Volume 196, 33rd Euromicro Conference on Real-Time Systems (ECRTS 2021)

Abstract

There exists a divide between the ever-increasing demand for high-performance embedded systems and the availability of practical methodologies to understand the interplay of complex data-intensive applications with hardware memory resources. On the one hand, traditional static analysis approaches are seldomly applicable to latest-generation multi-core platforms due to a lack of accurate micro-architectural models. On the other hand, measurement-based methods only provide coarse-grained information about the end-to-end execution of a given real-time application. In this paper, we describe a novel methodology, namely Black-Box Profiling (BBProf), to gather fine-grained insights on the usage of cache resources in applications of realistic complexity. The goal of our technique is to extract the relative importance of individual memory pages towards the overall temporal behavior of a target application. Importantly, BBProf does not require the semantics of the target application to be known - i.e., applications are treated as black-boxes - and it does not rely on any platform-specific hardware support. We provide an open-source full-system implementation and showcase how BBProf can be used to perform profile-driven cache management.

Cite as

Golsana Ghaemi, Dharmesh Tarapore, and Renato Mancuso. Governing with Insights: Towards Profile-Driven Cache Management of Black-Box Applications. In 33rd Euromicro Conference on Real-Time Systems (ECRTS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 196, pp. 4:1-4:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{ghaemi_et_al:LIPIcs.ECRTS.2021.4,
  author =	{Ghaemi, Golsana and Tarapore, Dharmesh and Mancuso, Renato},
  title =	{{Governing with Insights: Towards Profile-Driven Cache Management of Black-Box Applications}},
  booktitle =	{33rd Euromicro Conference on Real-Time Systems (ECRTS 2021)},
  pages =	{4:1--4:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-192-4},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{196},
  editor =	{Brandenburg, Bj\"{o}rn B.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2021.4},
  URN =		{urn:nbn:de:0030-drops-139359},
  doi =		{10.4230/LIPIcs.ECRTS.2021.4},
  annote =	{Keywords: Cache Profiling, WSS Estimation, Cache Interference, Real-time, Multicore, Contention-induced Instruction Stall, C2IS, Coloring, Cache Management, Cacheability}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2019.17

Impact of DM-LRU on WCET: A Static Analysis Approach

Authors: Renato Mancuso, Heechul Yun, and Isabelle Puaut

Published in: LIPIcs, Volume 133, 31st Euromicro Conference on Real-Time Systems (ECRTS 2019)

Abstract

Cache memories in modern embedded processors are known to improve average memory access performance. Unfortunately, they are also known to represent a major source of unpredictability for hard real-time workload. One of the main limitations of typical caches is that content selection and replacement is entirely performed in hardware. As such, it is hard to control the cache behavior in software to favor caching of blocks that are known to have an impact on an application’s worst-case execution time (WCET). In this paper, we consider a cache replacement policy, namely DM-LRU, that allows system designers to prioritize caching of memory blocks that are known to have an important impact on an application’s WCET. Considering a single-core, single-level cache hierarchy, we describe an abstract interpretation-based timing analysis for DM-LRU. We implement the proposed analysis in a self-contained toolkit and study its qualitative properties on a set of representative benchmarks. Apart from being useful to compute the WCET when DM-LRU or similar policies are used, the proposed analysis can allow designers to perform WCET impact-aware selection of content to be retained in cache.

Cite as

Renato Mancuso, Heechul Yun, and Isabelle Puaut. Impact of DM-LRU on WCET: A Static Analysis Approach. In 31st Euromicro Conference on Real-Time Systems (ECRTS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 133, pp. 17:1-17:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{mancuso_et_al:LIPIcs.ECRTS.2019.17,
  author =	{Mancuso, Renato and Yun, Heechul and Puaut, Isabelle},
  title =	{{Impact of DM-LRU on WCET: A Static Analysis Approach}},
  booktitle =	{31st Euromicro Conference on Real-Time Systems (ECRTS 2019)},
  pages =	{17:1--17:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-110-8},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{133},
  editor =	{Quinton, Sophie},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2019.17},
  URN =		{urn:nbn:de:0030-drops-107546},
  doi =		{10.4230/LIPIcs.ECRTS.2019.17},
  annote =	{Keywords: real-time, static cache analysis, abstract interpretation, LRU, deterministic memory, static cache locking, dynamic cache locking, cache profiling, WCET analysis}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2019.27

Designing Mixed Criticality Applications on Modern Heterogeneous MPSoC Platforms

Authors: Giovani Gracioli, Rohan Tabish, Renato Mancuso, Reza Mirosanlou, Rodolfo Pellizzoni, and Marco Caccamo

Published in: LIPIcs, Volume 133, 31st Euromicro Conference on Real-Time Systems (ECRTS 2019)

Abstract

Multiprocessor Systems-on-Chip (MPSoC) integrating hard processing cores with programmable logic (PL) are becoming increasingly common. While these platforms have been originally designed for high performance computing applications, their rich feature set can be exploited to efficiently implement mixed criticality domains serving both critical hard real-time tasks, as well as soft real-time tasks. In this paper, we take a deep look at commercially available heterogeneous MPSoCs that incorporate PL and a multicore processor. We show how one can tailor these processors to support a mixed criticality system, where cores are strictly isolated to avoid contention on shared resources such as Last-Level Cache (LLC) and main memory. In order to avoid conflicts in last-level cache, we propose the use of cache coloring, implemented in the Jailhouse hypervisor. In addition, we employ ScratchPad Memory (SPM) inside the PL to support a multi-phase execution model for real-time tasks that avoids conflicts in shared memory. We provide a full-stack, working implementation on a latest-generation MPSoC platform, and show results based on both a set of data intensive tasks, as well as a case study based on an image processing benchmark application.

Cite as

Giovani Gracioli, Rohan Tabish, Renato Mancuso, Reza Mirosanlou, Rodolfo Pellizzoni, and Marco Caccamo. Designing Mixed Criticality Applications on Modern Heterogeneous MPSoC Platforms. In 31st Euromicro Conference on Real-Time Systems (ECRTS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 133, pp. 27:1-27:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{gracioli_et_al:LIPIcs.ECRTS.2019.27,
  author =	{Gracioli, Giovani and Tabish, Rohan and Mancuso, Renato and Mirosanlou, Reza and Pellizzoni, Rodolfo and Caccamo, Marco},
  title =	{{Designing Mixed Criticality Applications on Modern Heterogeneous MPSoC Platforms}},
  booktitle =	{31st Euromicro Conference on Real-Time Systems (ECRTS 2019)},
  pages =	{27:1--27:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-110-8},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{133},
  editor =	{Quinton, Sophie},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2019.27},
  URN =		{urn:nbn:de:0030-drops-107645},
  doi =		{10.4230/LIPIcs.ECRTS.2019.27},
  annote =	{Keywords: Mixed-criticality systems, SoC Heterogeneous platforms, FPGA, real-time computing}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2018.1

Deterministic Memory Abstraction and Supporting Multicore System Architecture

Authors: Farzad Farshchi, Prathap Kumar Valsan, Renato Mancuso, and Heechul Yun

Published in: LIPIcs, Volume 106, 30th Euromicro Conference on Real-Time Systems (ECRTS 2018)

Abstract

Poor time predictability of multicore processors has been a long-standing challenge in the real-time systems community. In this paper, we make a case that a fundamental problem that prevents efficient and predictable real-time computing on multicore is the lack of a proper memory abstraction to express memory criticality, which cuts across various layers of the system: the application, OS, and hardware. We, therefore, propose a new holistic resource management approach driven by a new memory abstraction, which we call Deterministic Memory. The key characteristic of deterministic memory is that the platform-the OS and hardware-guarantees small and tightly bounded worst-case memory access timing. In contrast, we call the conventional memory abstraction as best-effort memory in which only highly pessimistic worst-case bounds can be achieved. We propose to utilize both abstractions to achieve high time predictability but without significantly sacrificing performance. We present deterministic memory-aware OS and architecture designs, including OS-level page allocator, hardware-level cache, and DRAM controller designs. We implement the proposed OS and architecture extensions on Linux and gem5 simulator. Our evaluation results, using a set of synthetic and real-world benchmarks, demonstrate the feasibility and effectiveness of our approach.

Cite as

Farzad Farshchi, Prathap Kumar Valsan, Renato Mancuso, and Heechul Yun. Deterministic Memory Abstraction and Supporting Multicore System Architecture. In 30th Euromicro Conference on Real-Time Systems (ECRTS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 106, pp. 1:1-1:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{farshchi_et_al:LIPIcs.ECRTS.2018.1,
  author =	{Farshchi, Farzad and Valsan, Prathap Kumar and Mancuso, Renato and Yun, Heechul},
  title =	{{Deterministic Memory Abstraction and Supporting Multicore System Architecture}},
  booktitle =	{30th Euromicro Conference on Real-Time Systems (ECRTS 2018)},
  pages =	{1:1--1:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-075-0},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{106},
  editor =	{Altmeyer, Sebastian},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2018.1},
  URN =		{urn:nbn:de:0030-drops-90010},
  doi =		{10.4230/LIPIcs.ECRTS.2018.1},
  annote =	{Keywords: multicore processors, real-time, shared cache, DRAM controller, Linux}
}

Document

DOI: 10.4230/LIPIcs.ECRTS.2017.3

WCET Derivation under Single Core Equivalence with Explicit Memory Budget Assignment

Authors: Renato Mancuso, Rodolfo Pellizzoni, Neriman Tokcan, and Marco Caccamo

Published in: LIPIcs, Volume 76, 29th Euromicro Conference on Real-Time Systems (ECRTS 2017)

Abstract

In the last decade there has been a steady uptrend in the popularity of embedded multi-core platforms. This represents a turning point in the theory and implementation of real-time systems. From a real-time standpoint, however, the extensive sharing of hardware resources (e.g. caches, DRAM subsystem, I/O channels) represents a major source of unpredictability. Budget-based memory regulation (throttling) has been extensively studied to enforce a strict partitioning of the DRAM subsystem’s bandwidth. The common approach to analyze a task under memory bandwidth regulation is to consider the budget of the core where the task is executing, and assume the worst-case about the remaining cores' budgets. In this work, we propose a novel analysis strategy to derive the WCET of a task under memory bandwidth regulation that takes into account the exact distribution of memory budgets to cores. In this sense, the proposed analysis represents a generalization of approaches that consider (i) even budget distribution across cores; and (ii) uneven but unknown (except for the core under analysis) budget assignment. By exploiting the additional piece of information, we show that it is possible to derive a more accurate WCET estimation. Our evaluations highlight that the proposed technique can reduce overestimation by 30% in average, and up to 60%, compared to the state of the art.

Cite as

Renato Mancuso, Rodolfo Pellizzoni, Neriman Tokcan, and Marco Caccamo. WCET Derivation under Single Core Equivalence with Explicit Memory Budget Assignment. In 29th Euromicro Conference on Real-Time Systems (ECRTS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 76, pp. 3:1-3:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{mancuso_et_al:LIPIcs.ECRTS.2017.3,
  author =	{Mancuso, Renato and Pellizzoni, Rodolfo and Tokcan, Neriman and Caccamo, Marco},
  title =	{{WCET Derivation under Single Core Equivalence with Explicit Memory Budget Assignment}},
  booktitle =	{29th Euromicro Conference on Real-Time Systems (ECRTS 2017)},
  pages =	{3:1--3:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-037-8},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{76},
  editor =	{Bertogna, Marko},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECRTS.2017.3},
  URN =		{urn:nbn:de:0030-drops-71684},
  doi =		{10.4230/LIPIcs.ECRTS.2017.3},
  annote =	{Keywords: real-time multicore, WCET, single-core equivalence, DRAM management, certification}
}

13 Search Results for "Mancuso, Renato"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as