ePAPI: Performance Application Programming Interface for Embedded Platforms

Authors: Jeremy Giesen, Enrico Mezzetti, Jaume Abella, Enrique Fernández, Francisco J. Cazorla

Author Details

Jeremy Giesen
  • Universitat Politècnica de Catalunya, Spain
  • Barcelona Supercomputing Center, Spain
Enrico Mezzetti
  • Barcelona Supercomputing Center, Spain
Jaume Abella
  • Barcelona Supercomputing Center, Spain
Enrique Fernández
  • Universidad de Las Palmas de Gran Canaria, Spain
Francisco J. Cazorla
  • Barcelona Supercomputing Center, Spain

Cite As

Jeremy Giesen, Enrico Mezzetti, Jaume Abella, Enrique Fernández, and Francisco J. Cazorla. ePAPI: Performance Application Programming Interface for Embedded Platforms. In 19th International Workshop on Worst-Case Execution Time Analysis (WCET 2019). Open Access Series in Informatics (OASIcs), Volume 72, pp. 3:1-3:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Performance Monitoring Counters (PMCs) have traditionally been used in the mainstream computing domain to debug and optimize software performance. PMCs are increasingly considered in embedded time-critical domains to collect in-depth information, e.g., cache misses and memory accesses, on software execution time on complex multicore platforms. On mainstream platforms, standardized specifications and applications such as the Performance Application Programming Interface (PAPI) and perf have been proposed to deal with variable PMC support across platforms by providing a shared interface for configuring and collecting traceable events. However, no equivalent solution exists for embedded critical processors, for which the user is required to deal with low-level, platform-specific, and error-prone manipulation of PMC registers. In this paper, we address the need for a standardized PMC interface in the embedded domain, especially with a view to supporting timing characterization of embedded platforms. We assess the compatibility of the PAPI interface with the PMC support available on the AURIX TC297, a reference automotive platform, and we implement and validate ePAPI, the first functionally equivalent and low-overhead implementation of PAPI for the considered embedded platform.

Subject Classification

ACM Subject Classification
  • Computer systems organization → Embedded software
  • Computer systems organization → Real-time systems
Keywords
  • Monitoring counters
  • embedded systems

