Memory Latency Distribution-Driven Regulation for Temporal Isolation in MPSoCs

Authors: Ahsan Saeed, Denis Hoornaert, Dakshina Dasari, Dirk Ziegenbein, Daniel Mueller-Gritschneder, Ulf Schlichtmann, Andreas Gerstlauer, Renato Mancuso




Author Details

Ahsan Saeed
  • Robert Bosch GmbH, Stuttgart, Germany
Denis Hoornaert
  • Technische Universität München, Germany
Dakshina Dasari
  • Robert Bosch GmbH, Stuttgart, Germany
Dirk Ziegenbein
  • Robert Bosch GmbH, Stuttgart, Germany
Daniel Mueller-Gritschneder
  • Technische Universität München, Germany
Ulf Schlichtmann
  • Technische Universität München, Germany
Andreas Gerstlauer
  • The University of Texas at Austin, TX, USA
Renato Mancuso
  • Boston University, MA, USA

Cite As

Ahsan Saeed, Denis Hoornaert, Dakshina Dasari, Dirk Ziegenbein, Daniel Mueller-Gritschneder, Ulf Schlichtmann, Andreas Gerstlauer, and Renato Mancuso. Memory Latency Distribution-Driven Regulation for Temporal Isolation in MPSoCs. In 35th Euromicro Conference on Real-Time Systems (ECRTS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 262, pp. 4:1-4:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.ECRTS.2023.4

Abstract

Temporal isolation is one of the most significant challenges that must be addressed before Multi-Processor Systems-on-Chip (MPSoCs) can be widely adopted in mixed-criticality systems with both time-sensitive real-time (RT) applications and performance-oriented non-real-time (NRT) applications. Specifically, the main memory subsystem is one of the most prevalent causes of interference, performance degradation, and loss of isolation. Existing memory bandwidth regulation mechanisms use static, dynamic, or predictive DRAM bandwidth management techniques to bring the execution time of an application under contention as close as possible to its execution time in isolation. In this paper, we propose a novel distribution-driven regulation mechanism whose goal is to achieve a timeliness objective formulated as a constraint on the probability of meeting a certain target execution time for the RT applications. Using existing interconnect-level Performance Monitoring Units (PMUs), we can observe the Cumulative Distribution Function (CDF) of the per-request memory latency. Regulation is then triggered to enforce first-order stochastic dominance with respect to a desired reference. Consequently, it is possible to enforce that the overall observed execution time, viewed as a random variable, is dominated by the reference execution time. The mechanism requires no prior information about the contending applications and treats the DRAM subsystem as a black box. We provide a full-stack implementation of our mechanism on a Commercial Off-The-Shelf (COTS) platform (Xilinx UltraScale+ MPSoC), evaluate it using real and synthetic benchmarks, experimentally validate that the timeliness objectives are met for the RT applications, and demonstrate that it is able to provide 2.2x more overall throughput for NRT applications compared to DRAM bandwidth management-based regulation approaches.
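
To make the dominance condition in the abstract concrete, the following is a minimal sketch of a first-order stochastic dominance check on empirical latency CDFs. It is an illustration only, not the authors' implementation: the function names, latency thresholds, reference CDF values, and sample data are all hypothetical. It assumes the usual convention that an observed latency is dominated by a reference when the observed CDF is at least the reference CDF at every threshold.

```python
from bisect import bisect_right


def empirical_cdf(samples, thresholds):
    """Estimate P(latency <= t) for each threshold t from raw samples."""
    ordered = sorted(samples)
    n = len(ordered)
    return [bisect_right(ordered, t) / n for t in thresholds]


def violates_reference(observed_cdf, reference_cdf):
    """Return True if the observed latency is NOT dominated by the reference.

    Dominance holds when the observed CDF is >= the reference CDF at
    every threshold, i.e. short latencies are at least as likely as in
    the reference distribution.
    """
    return any(obs < ref for obs, ref in zip(observed_cdf, reference_cdf))


# Hypothetical latency thresholds (in cycles) and target reference CDF.
thresholds = [100, 200, 300, 400]
reference_cdf = [0.50, 0.80, 0.95, 1.00]

# Hypothetical per-request latency samples for two monitoring windows.
calm = [80, 85, 90, 92, 95, 99, 120, 150, 180, 250]
contended = [90, 95, 150, 210, 250, 280, 310, 350, 380, 400]

calm_cdf = empirical_cdf(calm, thresholds)
contended_cdf = empirical_cdf(contended, thresholds)

# In this sketch, a violation is the point at which regulation of the
# contending (NRT) traffic would be triggered.
print("calm window violates reference:", violates_reference(calm_cdf, reference_cdf))
print("contended window violates reference:", violates_reference(contended_cdf, reference_cdf))
```

In this sketch the check is evaluated once per monitoring window; how the window is chosen and how throttling is applied after a violation are left out, since the abstract does not specify them.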

Subject Classification

ACM Subject Classification
  • Computer systems organization → Real-time systems
Keywords
  • temporal isolation
  • memory latency
  • real-time system
  • multi-core

