Modeling and Analysis of Bus Contention for Hardware Accelerators in FPGA SoCs

Restuccia, Francesco; Pagani, Marco; Biondi, Alessandro; Marinoni, Mauro; Buttazzo, Giorgio

doi:10.4230/LIPIcs.ECRTS.2020.12

Abstract

FPGA System-on-Chips (SoCs) are heterogeneous platforms that combine general-purpose processors with a field-programmable gate array (FPGA) fabric. The FPGA fabric consists of programmable logic in which hardware accelerators can be deployed to accelerate the execution of specific functionality. The main source of unpredictability when bounding the execution times of hardware accelerators is their access to the shared memories via the on-chip bus. This work focuses on bounding the worst-case bus contention experienced by the hardware accelerators deployed in the FPGA fabric. To this end, it considers the AMBA AXI bus, the de facto standard communication interface in most commercial off-the-shelf (COTS) FPGA SoCs, and presents an analysis technique to bound the response times of hardware accelerators implemented on such platforms. A fine-grained model of the AXI bus and AXI interconnects is first provided. Then, contention delays are studied under hierarchical bus infrastructures of arbitrary depth. Finally, experimental results are presented to validate the proposed model against execution traces on two modern FPGA-based SoCs produced by Xilinx (from the Zynq-7000 and Zynq UltraScale+ families) and to assess the performance of the proposed analysis.

