Documents authored by Kefallinos, Dionysios


Document
Performance Modeling & Mapping of LLM Inference on Heterogeneous Vectorized CGRAs

Authors: Dionysios Kefallinos, Georgios Alexandris, Alexis Maras, Panagiotis Chaidos, Manil Dev Gomony, Henk Corporaal, Dimitrios Soudris, and Sotirios Xydis

Published in: OASIcs, Volume 141, 17th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 15th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2026)


Abstract
Since the emergence of transformer-based models, the computational demands for Large Language Model (LLM) inference have been increasing exponentially, primarily due to their compounding parameter sizes, their structural complexity, and the use of non-linear functions. This tendency leads to the necessity of deploying them on low-power edge devices and DNN accelerators, to fuel next-generation agentic AI systems. Coarse-Grained Reconfigurable Architectures (CGRAs) have proven to be a compelling paradigm for edge acceleration, combining the programmability of general-purpose platforms with the high performance and energy efficiency associated with ASICs. In this work, we introduce an end-to-end performance modeling and mapping framework for LLM inference on heterogeneous CGRAs. Our methodology enables rapid exploration of the micro-architectural design space parameters, i.e., the number of processing elements, vector sizes, and memory configurations, by providing an accurate, explainable, and analytical CGRA performance modeling methodology, with an average cycle error of 0.9%. Architecturally, we build upon R-Blocks, a heterogeneous CGRA platform, and extend it to support floating-point arithmetic operations as well as a full-stack compilation and mapping flow for both full (FP32) and quantized (INT8) Llama2 models. The proposed methodology, evaluated on a 22nm technology node, achieves superior peak performance per Watt compared to related works such as REVAMP and CFEACT (1.8× and 2.8× respectively).

Cite as

Dionysios Kefallinos, Georgios Alexandris, Alexis Maras, Panagiotis Chaidos, Manil Dev Gomony, Henk Corporaal, Dimitrios Soudris, and Sotirios Xydis. Performance Modeling & Mapping of LLM Inference on Heterogeneous Vectorized CGRAs. In 17th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 15th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2026). Open Access Series in Informatics (OASIcs), Volume 141, pp. 8:1-8:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


@InProceedings{kefallinos_et_al:OASIcs.PARMA-DITAM.2026.8,
  author =	{Kefallinos, Dionysios and Alexandris, Georgios and Maras, Alexis and Chaidos, Panagiotis and Gomony, Manil Dev and Corporaal, Henk and Soudris, Dimitrios and Xydis, Sotirios},
  title =	{{Performance Modeling \& Mapping of LLM Inference on Heterogeneous Vectorized CGRAs}},
  booktitle =	{17th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 15th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2026)},
  pages =	{8:1--8:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-416-1},
  ISSN =	{2190-6807},
  year =	{2026},
  volume =	{141},
  editor =	{Baroffio, Davide and Busia, Paola and Denisov, Lev and Shukla, Nitin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2026.8},
  URN =		{urn:nbn:de:0030-drops-256752},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2026.8},
  annote =	{Keywords: Edge AI, LLM, CGRA, Heterogeneous Architectures, Performance Modeling, Hardware Acceleration, Low Power Computing}
}

Document
An Experimental Study of Algorithms for Packing Arborescences

Authors: Loukas Georgiadis, Dionysios Kefallinos, Anna Mpanti, and Stavros D. Nikolopoulos

Published in: LIPIcs, Volume 233, 20th International Symposium on Experimental Algorithms (SEA 2022)


Abstract
A classic result of Edmonds states that the maximum number of edge-disjoint arborescences of a directed graph G, rooted at a designated vertex s, equals the minimum cardinality c_G(s) of an s-cut of G. This concept is related to the edge connectivity λ(G) of a strongly connected directed graph G, defined as the minimum number of edges whose deletion leaves a graph that is not strongly connected. In this paper, we address the question of how efficiently we can compute a maximum packing of edge-disjoint arborescences in practice, compared to the time required to determine the edge connectivity of a graph. To that end, we explore the design space of efficient algorithms for packing arborescences of a directed graph in practice and conduct a thorough empirical study to highlight the merits and weaknesses of each technique. In particular, we present an efficient implementation of Gabow’s arborescence packing algorithm and provide a simple but efficient heuristic that significantly improves its running time in practice.
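
As a small illustration of the quantity in Edmonds' theorem, the sketch below computes the minimum s-cut cardinality c_G(s) as the minimum, over all vertices t other than s, of the maximum number of edge-disjoint paths from s to t. By the theorem, this value is also the size of a maximum packing of edge-disjoint arborescences rooted at s. This is not the paper's implementation and not Gabow's algorithm: it uses plain BFS augmenting paths on unit capacities, it returns only the value and not the arborescences, and the function names and the integer vertex labelling 0..n-1 are assumptions made for the example.

```python
from collections import defaultdict, deque


def max_edge_disjoint_paths(edges, s, t):
    """Maximum number of edge-disjoint s->t paths (unit-capacity max flow)."""
    cap = defaultdict(int)   # residual capacities, keyed by (u, v)
    adj = defaultdict(set)   # neighbours in the residual graph
    for u, v in edges:
        cap[(u, v)] += 1     # parallel edges add capacity
        adj[u].add(v)
        adj[v].add(u)        # reverse direction is needed for residual edges
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Push one unit of flow back along the path found.
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1


def min_s_cut(n, edges, s):
    """c_G(s): minimum over t != s of the s->t edge connectivity."""
    return min(max_edge_disjoint_paths(edges, s, t) for t in range(n) if t != s)


# Hypothetical 3-vertex example: two edge-disjoint arborescences rooted at 0
# exist here, namely {0->1, 1->2} and {0->2, 2->1}.
example_edges = [(0, 1), (0, 2), (1, 2), (2, 1)]
example_cut = min_s_cut(3, example_edges, 0)
```

Running one max-flow per target vertex is far slower than the specialized techniques the paper studies; the sketch is meant only to make the definition of c_G(s) concrete.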

Cite as

Loukas Georgiadis, Dionysios Kefallinos, Anna Mpanti, and Stavros D. Nikolopoulos. An Experimental Study of Algorithms for Packing Arborescences. In 20th International Symposium on Experimental Algorithms (SEA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 233, pp. 14:1-14:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


@InProceedings{georgiadis_et_al:LIPIcs.SEA.2022.14,
  author =	{Georgiadis, Loukas and Kefallinos, Dionysios and Mpanti, Anna and Nikolopoulos, Stavros D.},
  title =	{{An Experimental Study of Algorithms for Packing Arborescences}},
  booktitle =	{20th International Symposium on Experimental Algorithms (SEA 2022)},
  pages =	{14:1--14:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-251-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{233},
  editor =	{Schulz, Christian and U\c{c}ar, Bora},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2022.14},
  URN =		{urn:nbn:de:0030-drops-165480},
  doi =		{10.4230/LIPIcs.SEA.2022.14},
  annote =	{Keywords: Arborescences, Edge Connectivity, Graph Algorithms}
}