The Performance Effects of Virtual-Machine Instruction Pointer Updates

Authors: M. Anton Ertl, Bernd Paysan


File

LIPIcs.ECOOP.2024.14.pdf
  • Filesize: 0.87 MB
  • 26 pages

Author Details

M. Anton Ertl
  • TU Wien, Austria
Bernd Paysan
  • net2o, Munich, Germany

Cite As

M. Anton Ertl and Bernd Paysan. The Performance Effects of Virtual-Machine Instruction Pointer Updates. In 38th European Conference on Object-Oriented Programming (ECOOP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 313, pp. 14:1-14:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ECOOP.2024.14

Abstract

How much performance do VM instruction-pointer (IP) updates cost, and how much benefit do we get from optimizing them away? Two decades ago, optimizing them away had little effect on the hardware of the day, but on recent hardware the dependence chain of IP updates can become the critical path on processors with out-of-order execution. In particular, this happens if the VM instructions are lightweight and the application programs are loop-dominated. This work presents several ways of reducing or eliminating the dependence chains from IP updates: breaking the dependence chains (the loop optimization), reducing the number of IP updates (the c and ci optimizations), or reducing their latency (the b optimization). Some benchmarks see speedups from these optimizations by factors > 2 on most recent cores, while other benchmarks and older cores see more modest results, often in the speedup range 1.1-1.3.
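
To make the notion of an IP-update dependence chain concrete, here is a minimal sketch in C. It is not the paper's Gforth implementation, and it does not reproduce the loop, c, ci, or b optimizations as the authors define them; it only illustrates the general idea of reducing the number of IP updates in straight-line VM code. The VM, its opcodes (`OP_LIT`, `OP_ADD`, `OP_HALT`), and the names `run_baseline`, `run_combined`, and `run_stats` are all hypothetical.

```c
#include <assert.h>

typedef long cell;

/* Hypothetical opcodes; each OP_LIT is followed by one operand cell. */
enum { OP_LIT, OP_ADD, OP_HALT };

typedef struct {
    cell result;     /* top of stack when OP_HALT is reached */
    int ip_updates;  /* how many times ip was modified */
} run_stats;

/* Baseline: ip is advanced after every fetch, so each new ip value
   depends on the previous one and the updates form one serial chain. */
static run_stats run_baseline(const cell *code) {
    const cell *ip = code;
    cell stack[16];
    cell *sp = stack;
    run_stats st = { 0, 0 };
    for (;;) {
        cell op = *ip;
        ip++; st.ip_updates++;                 /* step over the opcode */
        switch (op) {
        case OP_LIT:
            *sp++ = *ip;
            ip++; st.ip_updates++;             /* step over the operand */
            break;
        case OP_ADD:
            sp--;
            sp[-1] += sp[0];
            break;
        default:                               /* OP_HALT or unknown opcode */
            st.result = (sp > stack) ? sp[-1] : 0;
            return st;
        }
    }
}

/* Combined: code specialized by hand for the fixed sequence
   LIT a; LIT b; ADD. Operands are read at constant offsets from one
   ip value, and ip is updated once at the end of the sequence. */
static run_stats run_combined(const cell *code) {
    const cell *ip = code;
    cell stack[16];
    cell *sp = stack;
    run_stats st = { 0, 0 };
    /* Check that the VM code really is the sequence assumed above. */
    assert(ip[0] == OP_LIT && ip[2] == OP_LIT && ip[4] == OP_ADD);
    *sp++ = ip[1];                             /* LIT a */
    *sp++ = ip[3];                             /* LIT b */
    sp--;
    sp[-1] += sp[0];                           /* ADD */
    ip += 5; st.ip_updates++;                  /* single update for all three */
    assert(*ip == OP_HALT);
    st.result = sp[-1];
    return st;
}
```

For the VM code `{ OP_LIT, 2, OP_LIT, 3, OP_ADD, OP_HALT }`, both functions compute the same result, but the baseline performs six dependent IP updates while the combined version performs one. Whether this shortens the critical path on a given core depends on what else the VM instructions do, which is the question the paper measures.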

Subject Classification

ACM Subject Classification
  • Software and its engineering → Virtual machines
  • Computer systems organization → Superscalar architectures
  • Software and its engineering → Interpreters
Keywords
  • virtual machine
  • interpreter
  • out-of-order execution
