SO(DA)^2: End-to-end Generation of Specialized Reconfigurable Architectures (Invited Talk)

Authors Antonino Tumeo, Nicolas Bohm Agostini, Serena Curzel, Ankur Limaye, Cheng Tan, Vinay Amatya, Marco Minutoli, Vito Giovanni Castellana, Ang Li, Joseph Manzano

Author Details

Antonino Tumeo
  • Pacific Northwest National Laboratory, Richland, WA, USA
Nicolas Bohm Agostini
  • Pacific Northwest National Laboratory, Atlanta, GA, USA
  • Northeastern University, Boston, MA, USA
Serena Curzel
  • Pacific Northwest National Laboratory, Richland, WA, USA
  • Politecnico di Milano, Italy
Ankur Limaye
  • Pacific Northwest National Laboratory, Richland, WA, USA
Cheng Tan
  • Microsoft, Seattle, WA, USA
Vinay Amatya
  • Pacific Northwest National Laboratory, Richland, WA, USA
Marco Minutoli
  • Pacific Northwest National Laboratory, Richland, WA, USA
Vito Giovanni Castellana
  • Pacific Northwest National Laboratory, Richland, WA, USA
Ang Li
  • Pacific Northwest National Laboratory, Richland, WA, USA
Joseph Manzano
  • Pacific Northwest National Laboratory, Richland, WA, USA


The research described in this paper is part of the Data-Model Convergence (DMC) Initiative at Pacific Northwest National Laboratory. It was conducted under the Laboratory Directed Research and Development Program at PNNL, a multiprogram national laboratory operated by Battelle for the U.S. Department of Energy.

Cite As

Antonino Tumeo, Nicolas Bohm Agostini, Serena Curzel, Ankur Limaye, Cheng Tan, Vinay Amatya, Marco Minutoli, Vito Giovanni Castellana, Ang Li, and Joseph Manzano. SO(DA)^2: End-to-end Generation of Specialized Reconfigurable Architectures (Invited Talk). In 13th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 11th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2022). Open Access Series in Informatics (OASIcs), Volume 100, pp. 1:1-1:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Modern data analysis applications are complex workflows composed of algorithms with diverse behaviors. They may include digital signal processing, data filtering, reduction, compression, graph algorithms, and machine learning. Their performance is highly dependent on the volume, the velocity, and the structure of the data. They are used in many different domains (from small embedded devices to large-scale high-performance computing systems), but in all cases they need to provide answers with very low latency to enable real-time decision making and autonomy. Coarse-grained reconfigurable arrays (CGRAs), i.e., architectures composed of functional units that perform complex operations, are interconnected through a network-on-chip, and form a configurable datapath onto which complex kernels are mapped, are a promising platform to accelerate these applications thanks to their adaptability. They provide higher flexibility than application-specific integrated circuits (ASICs) while offering better energy efficiency and faster reconfiguration than field-programmable gate arrays (FPGAs). However, designing and specializing CGRAs requires significant effort. The inherent flexibility of these devices makes the application mapping process as important as the hardware design generation. To obtain efficient systems, approaches that simultaneously consider software and hardware optimizations are necessary. In this paper, we discuss the Software Defined Architectures for Data Analytics (SO(DA)²) toolchain, an end-to-end hardware/software codesign framework to generate custom reconfigurable architectures for data analytics applications. SO(DA)² is composed of a high-level compiler (SODA-OPT) and a hardware generator (OpenCGRA) and can automatically explore and generate optimal CGRA designs starting from high-level programming frameworks. SO(DA)² considers partial dynamic reconfiguration as a key element of the system design. We discuss the various elements of the framework and demonstrate the flow on the case study of a partially dynamically reconfigurable CGRA design for data streaming applications.

Subject Classification

ACM Subject Classification
  • Computer systems organization → Reconfigurable computing
Keywords
  • Reconfigurable architectures
  • data analytics



