Precision Tuning in Parallel Applications

Authors: Gabriele Magnani, Lev Denisov, Daniele Cattaneo, Giovanni Agosta


Author Details

Gabriele Magnani
  • DEIB - Politecnico di Milano, Italy
Lev Denisov
  • DEIB - Politecnico di Milano, Italy
Daniele Cattaneo
  • DEIB - Politecnico di Milano, Italy
Giovanni Agosta
  • DEIB - Politecnico di Milano, Italy

Cite As

Gabriele Magnani, Lev Denisov, Daniele Cattaneo, and Giovanni Agosta. Precision Tuning in Parallel Applications. In 13th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 11th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2022). Open Access Series in Informatics (OASIcs), Volume 100, pp. 5:1-5:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Abstract

Parallel applications are pervasive today, in high-performance computing, in scientific computing, and even in everyday tasks, owing to the ubiquity of multi-core architectures. However, several implementation challenges have so far hindered the integration of automatic precision tuning into parallel applications. First, tuning a parallel application makes it harder to detect the code regions that the optimization must affect. Moreover, shared variables and accumulators require special handling. In this work we address these challenges by adding OpenMP parallel programming support to the TAFFO precision tuning framework. With our approach we achieve speedups of up to 750% with respect to the same parallel applications without precision tuning.

Subject Classification

ACM Subject Classification
  • Software and its engineering → Compilers
  • Theory of computation → Parallel computing models

Keywords and Phrases
  • Compilers
  • Parallel Programming
  • Precision Tuning



