Throughput and Memory Optimization for Parallel Implementations of Dataflow Networks Using Multi-Reader Buffers

Authors: Martin Letras, Joachim Falk, Jürgen Teich



File

OASIcs.NG-RES.2023.6.pdf
  • Filesize: 1.55 MB
  • 13 pages

Author Details

Martin Letras
  • Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Germany
Joachim Falk
  • Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Germany
Jürgen Teich
  • Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Germany

Cite As

Martin Letras, Joachim Falk, and Jürgen Teich. Throughput and Memory Optimization for Parallel Implementations of Dataflow Networks Using Multi-Reader Buffers. In Fourth Workshop on Next Generation Real-Time Embedded Systems (NG-RES 2023). Open Access Series in Informatics (OASIcs), Volume 108, pp. 6:1-6:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/OASIcs.NG-RES.2023.6

Abstract

In this paper, we introduce the concept of Multi-Reader Buffers (MRBs) for the high-throughput and memory-efficient implementation of dataflow applications. Our work is motivated by the large amounts of data that must be processed and are typically accessed in a FIFO manner, particularly in image and video processing applications. Here, the multi-cast, fork, and merge operator implementations known today incur large memory overheads because they store and communicate copies of the same data. As a remedy, we first introduce MRBs as buffers that preserve FIFO semantics for a finite number of readers of the same data while storing each data item only once. Second, we present an approach for memory minimization of dataflow networks that replaces all multi-cast actors and their connected FIFOs with MRBs. Third, we present a Design Space Exploration approach that selectively replaces multi-cast actors with MRBs in order to explore tradeoffs between memory, throughput, and processor resource allocation. Our results show that, for six benchmark applications, the Pareto fronts explored by our approach improve solution quality over a reference by 78% on average in terms of a hypervolume indicator.
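
To make the MRB idea concrete, the following is a minimal sketch of a buffer that stores each token once while giving every reader its own FIFO view. It is not the authors' implementation: the class name `MultiReaderBuffer`, its methods, and the ring-buffer layout with one read counter per reader are assumptions chosen for illustration, and synchronization between concurrent actors is left out.

```python
class MultiReaderBuffer:
    """Illustrative multi-reader FIFO (hypothetical names, single-threaded).

    Each token is stored once in a ring of `capacity` slots. Every reader
    keeps its own read counter, so each reader sees all tokens in FIFO order.
    """

    def __init__(self, capacity, num_readers):
        if capacity <= 0 or num_readers <= 0:
            raise ValueError("capacity and num_readers must be positive")
        self._slots = [None] * capacity
        self._capacity = capacity
        self._written = 0                # total number of tokens written
        self._read = [0] * num_readers   # tokens consumed so far, per reader

    def free_slots(self):
        # A slot can be reused only once the slowest reader has consumed it.
        return self._capacity - (self._written - min(self._read))

    def available(self, reader):
        # Number of tokens this reader has not consumed yet.
        return self._written - self._read[reader]

    def write(self, token):
        # Returns False instead of overwriting data that a reader still needs.
        if self.free_slots() == 0:
            return False
        self._slots[self._written % self._capacity] = token
        self._written += 1
        return True

    def read(self, reader):
        if self.available(reader) == 0:
            raise IndexError("no token available for this reader")
        token = self._slots[self._read[reader] % self._capacity]
        self._read[reader] += 1
        return token
```

In this sketch, a multi-cast actor feeding k separate FIFOs of depth n would hold up to k·n token copies, whereas a single shared buffer holds at most `capacity` tokens. The price is that the writer is throttled by the slowest reader, which hints at why replacing multi-cast actors selectively can trade memory against throughput.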

Subject Classification

ACM Subject Classification
  • Hardware → Static timing analysis
Keywords
  • Dataflow
  • Memory Optimization
  • MPSoCs
  • Design Space Exploration
