Document
Achieving Superscalar Performance without Superscalar Overheads - A Dataflow Compiler IR for Custom Computing

Authors: Ali Mustafa Zaidi and David J. Greaves

Published in: OASIcs, Volume 35, 2013 Imperial College Computing Student Workshop


Abstract
The difficulty of effectively parallelizing code for multicore processors, combined with the end of threshold voltage scaling, has resulted in the problem of 'Dark Silicon', severely limiting performance scaling despite Moore's Law. To address dark silicon, not only must we drastically improve the energy efficiency of computation, but due to Amdahl's Law, we must do so without compromising sequential performance. Designers increasingly utilize custom hardware to dramatically improve both efficiency and performance in increasingly heterogeneous architectures. Unfortunately, while it efficiently accelerates numeric, data-parallel applications, custom hardware often exhibits poor performance on sequential code, so complex, power-hungry superscalar processors must still be utilized. This paper addresses the problem of improving sequential performance in custom hardware by (a) switching from a statically scheduled to a dynamically scheduled (dataflow) execution model, and (b) developing a new compiler IR for high-level synthesis (HLS) that enables aggressive exposure of instruction-level parallelism (ILP) even in the presence of complex control flow. This new IR is directly implemented as a static dataflow graph in hardware by our high-level synthesis tool-chain, and shows an average speedup of 1.13 times over equivalent hardware generated using LegUp, an existing HLS tool. In addition, our new IR allows us to further trade area and energy for performance through loop unrolling, increasing the average speedup to 1.55 times, with a peak speedup of 4.05 times. Our custom hardware is able to approach the sequential cycle-counts of an Intel Nehalem Core i7 superscalar processor, while consuming on average only 0.25 times the energy of an in-order Altera Nios IIf processor.
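
The abstract's appeal to Amdahl's Law can be made concrete with a small calculation. The following is a minimal illustrative sketch, not taken from the paper: the function name `amdahl_speedup` and all input values are assumptions chosen only to show why the sequential portion of a program bounds the overall speedup, however fast the parallel portion becomes.

```python
def amdahl_speedup(parallel_fraction: float,
                   parallel_speedup: float,
                   sequential_speedup: float = 1.0) -> float:
    """Overall speedup under Amdahl's Law.

    parallel_fraction  -- share of the original run time that is parallelizable (0..1)
    parallel_speedup   -- factor by which the parallel share is accelerated
    sequential_speedup -- factor by which the remaining sequential share is accelerated

    All names and values here are hypothetical, for illustration only.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if parallel_speedup <= 0.0 or sequential_speedup <= 0.0:
        raise ValueError("speedup factors must be positive")
    sequential_fraction = 1.0 - parallel_fraction
    new_time = (sequential_fraction / sequential_speedup
                + parallel_fraction / parallel_speedup)
    return 1.0 / new_time


if __name__ == "__main__":
    # Illustrative inputs: 90% of the run time is parallelizable.
    baseline = amdahl_speedup(0.9, parallel_speedup=100.0)
    # Same program, but the sequential 10% also runs twice as fast.
    improved = amdahl_speedup(0.9, parallel_speedup=100.0, sequential_speedup=2.0)
    print(f"parallel part only accelerated: {baseline:.2f}x")
    print(f"sequential part also 2x faster: {improved:.2f}x")
```

With these assumed inputs, accelerating only the parallel part cannot exceed a 10 times overall speedup, while any gain on the sequential part raises that ceiling, which is the motivation the abstract gives for preserving sequential performance.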

Cite as

Ali Mustafa Zaidi and David J. Greaves. Achieving Superscalar Performance without Superscalar Overheads - A Dataflow Compiler IR for Custom Computing. In 2013 Imperial College Computing Student Workshop. Open Access Series in Informatics (OASIcs), Volume 35, pp. 136-143, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2013)



@InProceedings{zaidi_et_al:OASIcs.ICCSW.2013.136,
  author =	{Zaidi, Ali Mustafa and Greaves, David J.},
  title =	{{Achieving Superscalar Performance without Superscalar Overheads - A Dataflow Compiler IR for Custom Computing}},
  booktitle =	{2013 Imperial College Computing Student Workshop},
  pages =	{136--143},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-63-7},
  ISSN =	{2190-6807},
  year =	{2013},
  volume =	{35},
  editor =	{Jones, Andrew V. and Ng, Nicholas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.ICCSW.2013.136},
  URN =		{urn:nbn:de:0030-drops-42825},
  doi =		{10.4230/OASIcs.ICCSW.2013.136},
  annote =	{Keywords: High-level Synthesis, Instruction Level Parallelism, Custom Computing, Compilers, Dark Silicon}
}