Dynamic programming languages such as Python offer expressive power and programmer productivity at the expense of performance. Although the topic of optimizing Python has received considerable attention over the years, a key obstacle remains elusive: C extensions. Time and again, optimized run-time environments, such as JIT compilers and optimizing interpreters, fall short of optimizing across C extensions, as they cannot reason about the native code hiding underneath. To bridge this gap, we present an analysis of C extensions for Python. The analysis data indicates that C extensions come in different varieties. One such variety is to merely speed up a single thing, such as reading a file and processing it directly in C. Another variety offers broad access through an API, resulting in a domain-specific language realized by function calls. While the former variety of C extensions offer little optimization potential for optimizing run-times, we find that the latter variety does offer considerable optimization potential. This optimization potential rests on dynamic locality that C extensions cannot readily tap. We introduce a new, interpreter-based optimization leveraging this untapped optimization potential called Cross-Module Quickening. The key idea is that C extensions can use an optimization interface to register highly-optimized operations on C extension-specific datatypes. A quickening interpreter uses these information to continuously specialize programs with C extensions. To quantify the attainable performance potential of going beyond C extensions, we demonstrate a concrete instantiation of Cross-Module Quickening for the CPython interpreter and the popular NumPy C extension. We evaluate our implementation with the NPBench benchmark suite and report performance improvements by a factor of up to 2.84.
@InProceedings{berlakovich_et_al:LIPIcs.ECOOP.2024.6, author = {Berlakovich, Felix and Brunthaler, Stefan}, title = {{Cross Module Quickening - The Curious Case of C Extensions}}, booktitle = {38th European Conference on Object-Oriented Programming (ECOOP 2024)}, pages = {6:1--6:29}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-341-6}, ISSN = {1868-8969}, year = {2024}, volume = {313}, editor = {Aldrich, Jonathan and Salvaneschi, Guido}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2024.6}, URN = {urn:nbn:de:0030-drops-208557}, doi = {10.4230/LIPIcs.ECOOP.2024.6}, annote = {Keywords: interpreter, optimizations, C extensions, Python} }
Feedback for Dagstuhl Publishing