DROPS

Document

DOI: 10.4230/OASIcs.Programming.2025.20

On the Effectiveness of Interpreter-Guided Compiler Testing

Authors: Federico Lochbaum and Guillermo Polito

Published in: OASIcs, Volume 134, Companion Proceedings of the 9th International Conference on the Art, Science, and Engineering of Programming (Programming 2025)

Abstract

Guaranteeing that a compiler behaves correctly is a complex task often approached through test generation and fuzzing. Compiler test generation must not only ensure that a compiler generates code that does not break, but also that it implements the programming language semantics. Recently, interpreter-guided test generation has been proposed to test JIT compilers: Concolic-execution on the interpreter yields test cases for the language semantics which are then validated between differential testing of the interpreter and compiler. In previous work, this solution has been shown to find interpreter/compiler differences. However, little has been said about the effectiveness and the solution limits. In this paper we study the behavior of this technique, to shed light on future improvements and research. We experiment with this technique on the JIT compiler for the Pharo programming language, on two different backends: ARMv7 and x86. We explore how effective the solution is in terms of compiler coverage and its limitations, and we discuss how future research can overcome them. Moreover, we investigate how this technique combined with random constraint mutations increases backend compiler coverage.

Cite as

Federico Lochbaum and Guillermo Polito. On the Effectiveness of Interpreter-Guided Compiler Testing. In Companion Proceedings of the 9th International Conference on the Art, Science, and Engineering of Programming (Programming 2025). Open Access Series in Informatics (OASIcs), Volume 134, pp. 20:1-20:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{lochbaum_et_al:OASIcs.Programming.2025.20,
  author =	{Lochbaum, Federico and Polito, Guillermo},
  title =	{{On the Effectiveness of Interpreter-Guided Compiler Testing}},
  booktitle =	{Companion Proceedings of the 9th International Conference on the Art, Science, and Engineering of Programming (Programming 2025)},
  pages =	{20:1--20:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-382-9},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{134},
  editor =	{Edwards, Jonathan and Perera, Roly and Petricek, Tomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Programming.2025.20},
  URN =		{urn:nbn:de:0030-drops-243040},
  doi =		{10.4230/OASIcs.Programming.2025.20},
  annote =	{Keywords: Virtual Machines, Concolic Testing, JIT compilers, interpreters, Differential Testing, Constraint Mutations, Compiler Coverage}
}

@InProceedings{lochbaum_et_al:OASIcs.Programming.2025.20,
  author =	{Lochbaum, Federico and Polito, Guillermo},
  title =	{{On the Effectiveness of Interpreter-Guided Compiler Testing}},
  booktitle =	{Companion Proceedings of the 9th International Conference on the Art, Science, and Engineering of Programming (Programming 2025)},
  pages =	{20:1--20:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-382-9},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{134},
  editor =	{Edwards, Jonathan and Perera, Roly and Petricek, Tomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Programming.2025.20},
  URN =		{urn:nbn:de:0030-drops-243040},
  doi =		{10.4230/OASIcs.Programming.2025.20},
  annote =	{Keywords: Virtual Machines, Concolic Testing, JIT compilers, interpreters, Differential Testing, Constraint Mutations, Compiler Coverage}
}

Document

DOI: 10.4230/LIPIcs.ECOOP.2025.13

FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation

Authors: Amber Gorzynski and Alastair F. Donaldson

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)

Abstract

Decompilation is the process of translating compiled code into high-level code. Control flow recovery is a challenging part of the process. "Misdecompilations" can occur, whereby the decompiled code does not accurately represent the semantics of the compiled code, despite it being syntactically valid. This is problematic because it can mislead users who are trying to reason about the program. We present CFG-based program generation: a novel approach to randomised testing that aims to improve the control flow recovery of decompilers. CFG-based program generation involves randomly generating control flow graphs (CFGs) and paths through each graph. Inspired by prior work in the domain of GPU computing, (CFG, path) pairs are "fleshed" into test programs. Each program is decompiled and recompiled. The test oracle verifies whether the actual runtime path through the graph matches the expected path. Any difference in the execution paths after recompilation indicates a possible misdecompilation. A key benefit of this approach is that it is largely independent of the source and target languages in question because it is focused on control flow. The approach is therefore applicable to numerous decompilation settings. The trade-off resulting from the focus on control flow is that misdecompilation bugs that do not relate to control flow (e.g. bugs that involve specific arithmetic operations) are out of scope. We have implemented this approach in FuzzFlesh, an open-source randomised testing tool. FuzzFlesh can be easily configured to target a variety of low-level languages and decompiler toolchains because most of the CFG and path generation process is language-independent. At present, FuzzFlesh supports testing decompilation of Java bytecode, .NET assembly and x86 machine code. In addition to program generation, FuzzFlesh also includes an automated test-case reducer that operates on the CFG rather than the low-level program, which means that it can be applied to any of the target languages. We present a large experimental campaign applying FuzzFlesh to a variety of decompilers, leading to the discovery of 12 previously-unknown bugs across two language formats, six of which have been fixed. We present experiments comparing our generic FuzzFlesh tool to two state-of-the-art decompiler testing tools targeted at specific languages. As expected, the coverage our generic FuzzFlesh tool achieves on a given decompiler is lower than the coverage achieved by a tool specifically designed for the input format of that decompiler. However, due to its focus on control flow, FuzzFlesh is able to cover sections of control flow recovery code that the targeted tools cannot reach, and identify control flow related bugs that the targeted tools miss.

Cite as

Amber Gorzynski and Alastair F. Donaldson. FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 13:1-13:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{gorzynski_et_al:LIPIcs.ECOOP.2025.13,
  author =	{Gorzynski, Amber and Donaldson, Alastair F.},
  title =	{{FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{13:1--13:26},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.13},
  URN =		{urn:nbn:de:0030-drops-233062},
  doi =		{10.4230/LIPIcs.ECOOP.2025.13},
  annote =	{Keywords: Decompiler, Reverse Engineering, Control Flow, Software Testing, Fuzzing}
}

Document

Vision

DOI: 10.4230/TGDK.1.1.10

Multilingual Knowledge Graphs and Low-Resource Languages: A Review

Authors: Lucie-Aimée Kaffee, Russa Biswas, C. Maria Keet, Edlira Kalemi Vakaj, and Gerard de Melo

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1

Abstract

There is a lack of multilingual data to support applications in a large number of languages, especially for low-resource languages. Knowledge graphs (KG) could contribute to closing the gap of language support by providing easily accessible, machine-readable, multilingual linked data, which can be reused across applications. In this paper, we provide an overview of work in the domain of multilingual KGs with a focus on low-resource languages. We review the current state of multilingual KGs along with the different aspects that are crucial for creating KGs with language coverage in mind. Special consideration is given to challenges particular to low-resource languages in KGs. We further provide an overview of applications that yield multilingual KG information as well as downstream applications reusing such multilingual data. Finally, we explore open problems regarding multilingual KGs with a focus on low-resource languages.

Cite as

Lucie-Aimée Kaffee, Russa Biswas, C. Maria Keet, Edlira Kalemi Vakaj, and Gerard de Melo. Multilingual Knowledge Graphs and Low-Resource Languages: A Review. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 10:1-10:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@Article{kaffee_et_al:TGDK.1.1.10,
  author =	{Kaffee, Lucie-Aim\'{e}e and Biswas, Russa and Keet, C. Maria and Vakaj, Edlira Kalemi and de Melo, Gerard},
  title =	{{Multilingual Knowledge Graphs and Low-Resource Languages: A Review}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{10:1--10:19},
  ISSN =	{2942-7517},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.10},
  URN =		{urn:nbn:de:0030-drops-194845},
  doi =		{10.4230/TGDK.1.1.10},
  annote =	{Keywords: knowledge graphs, multilingual, low-resource languages, review}
}

Document

DOI: 10.4230/LIPIcs.ECOOP.2018.6

Learning to Accelerate Symbolic Execution via Code Transformation

Authors: Junjie Chen, Wenxiang Hu, Lingming Zhang, Dan Hao, Sarfraz Khurshid, and Lu Zhang

Published in: LIPIcs, Volume 109, 32nd European Conference on Object-Oriented Programming (ECOOP 2018)

Abstract

Symbolic execution is an effective but expensive technique for automated test generation. Over the years, a large number of refined symbolic execution techniques have been proposed to improve its efficiency. However, the symbolic execution efficiency problem remains, and largely limits the application of symbolic execution in practice. Orthogonal to refined symbolic execution, in this paper we propose to accelerate symbolic execution through semantic-preserving code transformation on the target programs. During the initial stage of this direction, we adopt a particular code transformation, compiler optimization, which is initially proposed to accelerate program concrete execution by transforming the source program into another semantic-preserving target program with increased efficiency (e.g., faster or smaller). However, compiler optimizations are mostly designed to accelerate program concrete execution rather than symbolic execution. Recent work also reported that unified settings on compiler optimizations that can accelerate symbolic execution for any program do not exist at all. Therefore, in this work we propose a machine-learning based approach to tuning compiler optimizations to accelerate symbolic execution, whose results may also aid further design of specific code transformations for symbolic execution. In particular, the proposed approach LEO separates source-code functions and libraries through our program-splitter, and predicts individual compiler optimization (i.e., whether a type of code transformation is chosen) separately through analyzing the performance of existing symbolic execution. Finally, LEO applies symbolic execution on the code transformed by compiler optimization (through our local-optimizer). We conduct an empirical study on GNU Coreutils programs using the KLEE symbolic execution engine. The results show that LEO significantly accelerates symbolic execution, outperforming the default KLEE configurations (i.e., turning on/off all compiler optimizations) in various settings, e.g., with the default training/testing time, LEO achieves the highest line coverage in 50/68 programs, and its average improvement rate on all programs is 46.48%/88.92% in terms of line coverage compared with turning on/off all compiler optimizations.

Cite as

Junjie Chen, Wenxiang Hu, Lingming Zhang, Dan Hao, Sarfraz Khurshid, and Lu Zhang. Learning to Accelerate Symbolic Execution via Code Transformation. In 32nd European Conference on Object-Oriented Programming (ECOOP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 109, pp. 6:1-6:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.ECOOP.2018.6,
  author =	{Chen, Junjie and Hu, Wenxiang and Zhang, Lingming and Hao, Dan and Khurshid, Sarfraz and Zhang, Lu},
  title =	{{Learning to Accelerate Symbolic Execution via Code Transformation}},
  booktitle =	{32nd European Conference on Object-Oriented Programming (ECOOP 2018)},
  pages =	{6:1--6:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-079-8},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{109},
  editor =	{Millstein, Todd},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2018.6},
  URN =		{urn:nbn:de:0030-drops-92115},
  doi =		{10.4230/LIPIcs.ECOOP.2018.6},
  annote =	{Keywords: Symbolic Execution, Code Transformation, Machine Learning}
}

4 Search Results for "Hu, Wenxiang"

On the Effectiveness of Interpreter-Guided Compiler Testing

Abstract

Cite as

FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation

Abstract

Cite as

Multilingual Knowledge Graphs and Low-Resource Languages: A Review

Abstract

Cite as

Learning to Accelerate Symbolic Execution via Code Transformation

Abstract

Cite as

Thanks for your feedback!

Could not send message