16 Search Results for "Qian, Tao"


Document
Research
Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web

Authors: Florian Ruosch, Cristina Sarasua, and Abraham Bernstein

Published in: TGDK, Volume 3, Issue 3 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 3


Abstract
In Argument Mining, predicting argumentative relations between texts (or spans) remains one of the most challenging aspects, even more so in the cross-document setting. This paper makes three key contributions to advance research in this domain. We first extend an existing dataset, the Sci-Arg corpus, by annotating it with explicit inter-document argumentative relations, thereby allowing arguments to be distributed over several documents forming an Argument Web; these new annotations are published using Semantic Web technologies (RDF, OWL). Second, we explore and evaluate three automated approaches for predicting these inter-document argumentative relations, establishing critical baselines on the new dataset. We find that a simple classifier based on discourse indicators with access to context outperforms neural methods. Third, we conduct a comparative analysis of these approaches for both intra- and inter-document settings, identifying statistically significant differences in results that indicate the necessity of distinguishing between these two scenarios. Our findings highlight significant challenges in this complex domain and open crucial avenues for future research on the Argument Web of Science, particularly for those interested in leveraging Semantic Web technologies and knowledge graphs to understand scholarly discourse. With this, we provide the first stepping stones in the form of a benchmark dataset, three baseline methods, and an initial analysis for a systematic exploration of this field relevant to the Web of Data and Science.

Cite as

Florian Ruosch, Cristina Sarasua, and Abraham Bernstein. Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3, pp. 4:1-4:33, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@Article{ruosch_et_al:TGDK.3.3.4,
  author =	{Ruosch, Florian and Sarasua, Cristina and Bernstein, Abraham},
  title =	{{Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:33},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.3.4},
  URN =		{urn:nbn:de:0030-drops-252159},
  doi =		{10.4230/TGDK.3.3.4},
  annote =	{Keywords: Argument Mining, Large Language Models, Knowledge Graphs, Link Prediction}
}
Document
Fairness and Efficiency in Two-Sided Matching Markets

Authors: Pallavi Jain, Palash Jha, and Shubham Solanki

Published in: LIPIcs, Volume 360, 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)


Abstract
We propose a new fairness notion, motivated by the practical challenge of allocating teaching assistants (TAs) to courses in a department. Each course requires a certain number of TAs and each TA has preferences over the courses they want to assist. Similarly, each course instructor has preferences over the TAs who applied for their course. We demand fairness and efficiency for both sides separately, giving rise to the following criteria: (i) every course gets the required number of TAs and the average utility of the assigned TAs meets a threshold; (ii) the allocation of courses to TAs is envy-free, where a TA envies another TA if the former prefers the latter’s course and has a higher or equal grade in that course. Note that the definition of envy-freeness here differs from the one in the literature, and we call it merit-based envy-freeness. We show that the problem of finding a merit-based envy-free and efficient matching is NP-hard even for very restricted settings, such as two courses and uniform valuations; constant degree, constant capacity of TAs for every course, valuations in the range {0,1,2,3}, identical valuations from TAs, and even more. To find tractable results, we consider some restricted instances, such as, strict valuation of TAs for courses, the difference between the number of positively valued TAs for a course and the capacity, the number of positively valued TAs/courses, types of valuation functions, and obtained some polynomial-time solvable cases, showing the contrast with intractable results. We further studied the problem in the paradigm of parameterized algorithms and designed some exact and approximation algorithms.

Cite as

Pallavi Jain, Palash Jha, and Shubham Solanki. Fairness and Efficiency in Two-Sided Matching Markets. In 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 360, pp. 38:1-38:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{jain_et_al:LIPIcs.FSTTCS.2025.38,
  author =	{Jain, Pallavi and Jha, Palash and Solanki, Shubham},
  title =	{{Fairness and Efficiency in Two-Sided Matching Markets}},
  booktitle =	{45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)},
  pages =	{38:1--38:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-406-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{360},
  editor =	{Aiswarya, C. and Mehta, Ruta and Roy, Subhajit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2025.38},
  URN =		{urn:nbn:de:0030-drops-251186},
  doi =		{10.4230/LIPIcs.FSTTCS.2025.38},
  annote =	{Keywords: Fair Matching, Envy-Freeness, Efficiency}
}
Document
Survey
Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2


Abstract
In recent years, knowledge graphs have gained interest and witnessed widespread applications in various domains, such as information retrieval, question-answering, recommendation systems, amongst others. Large-scale knowledge graphs to this end have demonstrated their utility in effectively representing structured knowledge. To further facilitate the application of machine learning techniques, knowledge graph embedding models have been developed. Such models can transform entities and relationships within knowledge graphs into vectors. However, these embedding models often face challenges related to noise, missing information, distribution shift, adversarial attacks, etc. This can lead to sub-optimal embeddings and incorrect inferences, thereby negatively impacting downstream applications. While the existing literature has focused so far on adversarial attacks on KGE models, the challenges related to the other critical aspects remain unexplored. In this paper, we, first of all, give a unified definition of resilience, encompassing several factors such as generalisation, in-distribution generalization, distribution adaption, and robustness. After formalizing these concepts for machine learning in general, we define them in the context of knowledge graphs. To find the gap in the existing works on resilience in the context of knowledge graphs, we perform a systematic survey, taking into account all these aspects mentioned previously. Our survey results show that most of the existing works focus on a specific aspect of resilience, namely robustness. After categorizing such works based on their respective aspects of resilience, we discuss the challenges and future research directions.

Cite as

Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo. Resilience in Knowledge Graph Embeddings. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 1:1-1:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@Article{sharma_et_al:TGDK.3.2.1,
  author =	{Sharma, Arnab and Kouagou, N'Dah Jean and Ngomo, Axel-Cyrille Ngonga},
  title =	{{Resilience in Knowledge Graph Embeddings}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:38},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.1},
  URN =		{urn:nbn:de:0030-drops-248117},
  doi =		{10.4230/TGDK.3.2.1},
  annote =	{Keywords: Knowledge graphs, Resilience, Robustness}
}
Document
Multidimensional Blockchain Fees Are (Essentially) Optimal

Authors: Guillermo Angeris, Theo Diamandis, and Ciamac Moallemi

Published in: LIPIcs, Volume 354, 7th Conference on Advances in Financial Technologies (AFT 2025)


Abstract
In this paper we show that, using only mild assumptions, dynamic multidimensional blockchain fee markets have strong performance guarantees, even against worst-case adversaries. In particular, we show that the average welfare gap between the following two scenarios is at most O(1/√T), where T is the length of the time horizon considered. In the first scenario, the designer knows all future actions by users and is allowed to fix the optimal prices of resources ahead of time, based on the designer’s oracular knowledge of those actions. In the second, the prices are updated by a very simple algorithm that does not have this oracular knowledge, special cases of which are EIP-4844 and EIP-1559, both fee mechanisms used by the Ethereum blockchain. Roughly speaking, this means that, on average, over a reasonable timescale, there is no difference in welfare between "correctly" fixing the prices, with oracular knowledge of the future, when compared to the proposed algorithm. We show a matching lower bound of Ω(1/√T) for any implementable algorithm and also separately consider the case where the adversary is known to be stochastic.

Cite as

Guillermo Angeris, Theo Diamandis, and Ciamac Moallemi. Multidimensional Blockchain Fees Are (Essentially) Optimal. In 7th Conference on Advances in Financial Technologies (AFT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 354, pp. 24:1-24:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{angeris_et_al:LIPIcs.AFT.2025.24,
  author =	{Angeris, Guillermo and Diamandis, Theo and Moallemi, Ciamac},
  title =	{{Multidimensional Blockchain Fees Are (Essentially) Optimal}},
  booktitle =	{7th Conference on Advances in Financial Technologies (AFT 2025)},
  pages =	{24:1--24:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-400-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{354},
  editor =	{Avarikioti, Zeta and Christin, Nicolas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.AFT.2025.24},
  URN =		{urn:nbn:de:0030-drops-247433},
  doi =		{10.4230/LIPIcs.AFT.2025.24},
  annote =	{Keywords: Blockchains, transaction fees, online optimization, convex optimization}
}
Document
Monitoring the Structural Health of Space Habitats Through Immersive Data Art Visualization

Authors: Ze Gao, Yuan Zhuang, Kunqi Wang, and Mengyao Guo

Published in: OASIcs, Volume 130, Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)


Abstract
As humanity advances toward long-term space habitation, traditional SHM systems - reliant on abstract data representations - struggle to support rapid decision-making in extreme environments. This study addresses this critical gap by introducing an engineering-art-human factors framework that transforms SHM through immersive data-art visualization. By integrating sensor networks and machine learning, structural data (stress, vibration, deformation) is converted into intuitive visual languages: dynamic color gradients and biomimetic morphologies leverage perceptual laws (e.g., Weber-Fechner) to amplify critical signals. Multimodal interfaces (AR, haptic feedback) and natural elements mitigate cognitive load and psychological stress in confined habitats. Our contribution lies in redefining SHM as a synergy of precision and intuition, enabling "at-a-glance" assessments while balancing functionality and human-centric design. The urgency of this research stems from the inadequacy of conventional systems in extreme space conditions and the growing demand for astronaut safety and operational efficiency. This framework not only pioneers a sustainable monitoring paradigm for space habitats but also extends to terrestrial high-risk infrastructure, demonstrating the necessity of interdisciplinary innovation in extreme environments.

Cite as

Ze Gao, Yuan Zhuang, Kunqi Wang, and Mengyao Guo. Monitoring the Structural Health of Space Habitats Through Immersive Data Art Visualization. In Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025). Open Access Series in Informatics (OASIcs), Volume 130, pp. 31:1-31:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{gao_et_al:OASIcs.SpaceCHI.2025.31,
  author =	{Gao, Ze and Zhuang, Yuan and Wang, Kunqi and Guo, Mengyao},
  title =	{{Monitoring the Structural Health of Space Habitats Through Immersive Data Art Visualization}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{31:1--31:18},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.31},
  URN =		{urn:nbn:de:0030-drops-240217},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.31},
  annote =	{Keywords: Structural health monitoring, space habitats, immersive visualization, human-centered design, interdisciplinary innovation}
}
Document
Estimation of Substitution and Indel Rates via k-mer Statistics

Authors: Mahmudur Rahman Hera, Paul Medvedev, David Koslicki, and Antonio Blanca

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)


Abstract
Methods utilizing k-mers are widely used in bioinformatics, yet our understanding of their statistical properties under realistic mutation models remains incomplete. Previously, substitution-only mutation models have been considered to derive precise expectations and variances for mutated k-mers and intervals of mutated and non-mutated sequences. In this work, we consider a mutation model that incorporates insertions and deletions in addition to single-nucleotide substitutions. Within this framework, we derive closed-form k-mer-based estimators for the three fundamental mutation parameters: substitution, deletion rate, and insertion rates. We provide theoretical guarantees in the form of concentration inequalities, ensuring accuracy of our estimators under reasonable model assumptions. Empirical evaluations on simulated evolution of genomic sequences confirm our theoretical findings, demonstrating that accounting for insertions and deletions signals allows for accurate estimation of mutation rates and improves upon the results obtained by considering a substitution-only model. An implementation of estimating the mutation parameters from a pair of fasta files is available here: https://github.com/KoslickiLab/estimate_rates_using_mutation_model.git. The results presented in this manuscript can be reproduced using the code available here: https://github.com/KoslickiLab/est_rates_experiments.git.

Cite as

Mahmudur Rahman Hera, Paul Medvedev, David Koslicki, and Antonio Blanca. Estimation of Substitution and Indel Rates via k-mer Statistics. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 16:1-16:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{rahmanhera_et_al:LIPIcs.WABI.2025.16,
  author =	{Rahman Hera, Mahmudur and Medvedev, Paul and Koslicki, David and Blanca, Antonio},
  title =	{{Estimation of Substitution and Indel Rates via k-mer Statistics}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{16:1--16:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.16},
  URN =		{urn:nbn:de:0030-drops-239422},
  doi =		{10.4230/LIPIcs.WABI.2025.16},
  annote =	{Keywords: k-mers, mutation rate, indel, alignment-free, estimation, substitution, insertion, deletion}
}
Document
FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation

Authors: Amber Gorzynski and Alastair F. Donaldson

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)


Abstract
Decompilation is the process of translating compiled code into high-level code. Control flow recovery is a challenging part of the process. "Misdecompilations" can occur, whereby the decompiled code does not accurately represent the semantics of the compiled code, despite it being syntactically valid. This is problematic because it can mislead users who are trying to reason about the program. We present CFG-based program generation: a novel approach to randomised testing that aims to improve the control flow recovery of decompilers. CFG-based program generation involves randomly generating control flow graphs (CFGs) and paths through each graph. Inspired by prior work in the domain of GPU computing, (CFG, path) pairs are "fleshed" into test programs. Each program is decompiled and recompiled. The test oracle verifies whether the actual runtime path through the graph matches the expected path. Any difference in the execution paths after recompilation indicates a possible misdecompilation. A key benefit of this approach is that it is largely independent of the source and target languages in question because it is focused on control flow. The approach is therefore applicable to numerous decompilation settings. The trade-off resulting from the focus on control flow is that misdecompilation bugs that do not relate to control flow (e.g. bugs that involve specific arithmetic operations) are out of scope. We have implemented this approach in FuzzFlesh, an open-source randomised testing tool. FuzzFlesh can be easily configured to target a variety of low-level languages and decompiler toolchains because most of the CFG and path generation process is language-independent. At present, FuzzFlesh supports testing decompilation of Java bytecode, .NET assembly and x86 machine code. In addition to program generation, FuzzFlesh also includes an automated test-case reducer that operates on the CFG rather than the low-level program, which means that it can be applied to any of the target languages. We present a large experimental campaign applying FuzzFlesh to a variety of decompilers, leading to the discovery of 12 previously-unknown bugs across two language formats, six of which have been fixed. We present experiments comparing our generic FuzzFlesh tool to two state-of-the-art decompiler testing tools targeted at specific languages. As expected, the coverage our generic FuzzFlesh tool achieves on a given decompiler is lower than the coverage achieved by a tool specifically designed for the input format of that decompiler. However, due to its focus on control flow, FuzzFlesh is able to cover sections of control flow recovery code that the targeted tools cannot reach, and identify control flow related bugs that the targeted tools miss.

Cite as

Amber Gorzynski and Alastair F. Donaldson. FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 13:1-13:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{gorzynski_et_al:LIPIcs.ECOOP.2025.13,
  author =	{Gorzynski, Amber and Donaldson, Alastair F.},
  title =	{{FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{13:1--13:26},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.13},
  URN =		{urn:nbn:de:0030-drops-233062},
  doi =		{10.4230/LIPIcs.ECOOP.2025.13},
  annote =	{Keywords: Decompiler, Reverse Engineering, Control Flow, Software Testing, Fuzzing}
}
Document
Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories

Authors: Tianyu Chen, Zeyu Wang, Lin Li, Ding Li, Zongyang Li, Xiaoning Chang, Pan Bian, Guangtai Liang, Qianxiang Wang, and Tao Xie

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)


Abstract
Functionality-specific vulnerabilities, which mainly occur in Application Programming Interfaces (APIs) with specific functionalities, are crucial for software developers to detect and avoid. When detecting individual functionality-specific vulnerabilities, the existing two categories of approaches are ineffective because they consider only the API bodies and are unable to handle diverse implementations of functionality-equivalent APIs. To effectively detect functionality-specific vulnerabilities, we propose APISS, the first approach to utilize API doc strings and signatures instead of API bodies. APISS first retrieves functionality-equivalent APIs for APIs with existing vulnerabilities and then migrates Proof-of-Concepts (PoCs) of the existing vulnerabilities for newly detected vulnerable APIs. To retrieve functionality-equivalent APIs, we leverage a Large Language Model for API embedding to improve the accuracy and address the effectiveness and scalability issues suffered by the existing approaches. To migrate PoCs of the existing vulnerabilities for newly detected vulnerable APIs, we design a semi-automatic schema to substantially reduce manual costs. We conduct a comprehensive evaluation to empirically compare APISS with four state-of-the-art approaches of detecting vulnerabilities and two state-of-the-art approaches of retrieving functionality-equivalent APIs. The evaluation subjects include 180 widely used Java repositories using 10 existing vulnerabilities, along with their PoCs. The results show that APISS effectively retrieves functionality-equivalent APIs, achieving a Top-1 Accuracy of 0.81 while the best of the baselines under comparison achieves only 0.55. APISS is highly efficient: the manual costs are within 10 minutes per vulnerability and the end-to-end runtime overhead of testing one candidate API is less than 2 hours. APISS detects 179 new vulnerabilities and receives 60 new CVE IDs, bringing high value to security practice.

Cite as

Tianyu Chen, Zeyu Wang, Lin Li, Ding Li, Zongyang Li, Xiaoning Chang, Pan Bian, Guangtai Liang, Qianxiang Wang, and Tao Xie. Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 6:1-6:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.ECOOP.2025.6,
  author =	{Chen, Tianyu and Wang, Zeyu and Li, Lin and Li, Ding and Li, Zongyang and Chang, Xiaoning and Bian, Pan and Liang, Guangtai and Wang, Qianxiang and Xie, Tao},
  title =	{{Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{6:1--6:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.6},
  URN =		{urn:nbn:de:0030-drops-232999},
  doi =		{10.4230/LIPIcs.ECOOP.2025.6},
  annote =	{Keywords: Application Security, Vulnerability Detection, Large Language Model}
}
Document
Replication Paper
Scaling Up: Revisiting Mining Android Sandboxes at Scale for Malware Classification (Replication Paper)

Authors: Francisco Handrick Tomaz da Costa, Ismael Medeiros, Leandro Oliveira, João Calássio, Rodrigo Bonifácio, Krishna Narasimhan, Mira Mezini, and Márcio Ribeiro

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)


Abstract
The widespread use of smartphones in daily life has raised concerns about privacy and security among researchers and practitioners. Privacy issues are generally highly prevalent in mobile applications, particularly targeting the Android platform - the most popular mobile operating system. For this reason, several techniques have been proposed to identify malicious behavior in Android applications, including the Mining Android Sandbox approach (MAS approach), which aims to identify malicious behavior in repackaged Android applications (apps). However, previous empirical studies evaluated the MAS approach using a small dataset consisting of only 102 pairs of original and repackaged apps. This limitation raises questions about the external validity of their findings and whether the MAS approach can be generalized to larger datasets. To address these concerns, this paper presents the results of a replication study focused on evaluating the performance of the MAS approach regarding its capabilities of correctly classifying malware from different families. Unlike previous studies, our research employs a dataset that is an order of magnitude larger, comprising 4,076 pairs of apps covering a more diverse range of Android malware families. Surprisingly, our findings indicate a poor performance of the MAS approach for identifying malware, with the F1-score decreasing from 0.90 for the small dataset used in the previous studies to 0.54 in our more extensive dataset. Upon closer examination, we discovered that certain malware families partially account for the low accuracy of the MAS approach, which fails to classify a repackaged version of an app as malware correctly. Our findings highlight the limitations of the MAS approach, particularly when scaled, and underscore the importance of complementing it with other techniques to detect a broader range of malware effectively. This opens avenues for further discussion on addressing the blind spots that affect the accuracy of the MAS approach.

Cite as

Francisco Handrick Tomaz da Costa, Ismael Medeiros, Leandro Oliveira, João Calássio, Rodrigo Bonifácio, Krishna Narasimhan, Mira Mezini, and Márcio Ribeiro. Scaling Up: Revisiting Mining Android Sandboxes at Scale for Malware Classification (Replication Paper). In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 40:1-40:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{handricktomazdacosta_et_al:LIPIcs.ECOOP.2025.40,
  author =	{Handrick Tomaz da Costa, Francisco and Medeiros, Ismael and Oliveira, Leandro and Cal\'{a}ssio, Jo\~{a}o and Bonif\'{a}cio, Rodrigo and Narasimhan, Krishna and Mezini, Mira and Ribeiro, M\'{a}rcio},
  title =	{{Scaling Up: Revisiting Mining Android Sandboxes at Scale for Malware Classification}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{40:1--40:26},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.40},
  URN =		{urn:nbn:de:0030-drops-233320},
  doi =		{10.4230/LIPIcs.ECOOP.2025.40},
  annote =	{Keywords: Android Malware Detection, Dynamic Analysis, Mining Android Sandboxes}
}
Document
Partition Constraints for Conjunctive Queries: Bounds and Worst-Case Optimal Joins

Authors: Kyle Deeds and Timo Camillo Merkl

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)


Abstract
In the last decade, various works have used statistics on relations to improve both the theory and practice of conjunctive query execution. Starting with the AGM bound which took advantage of relation sizes, later works incorporated statistics like functional dependencies and degree constraints. Each new statistic prompted work along two lines; bounding the size of conjunctive query outputs and worst-case optimal join algorithms. In this work, we continue in this vein by introducing a new statistic called a partition constraint. This statistic captures latent structure within relations by partitioning them into sub-relations which each have much tighter degree constraints. We show that this approach can both refine existing cardinality bounds and improve existing worst-case optimal join algorithms.

Cite as

Kyle Deeds and Timo Camillo Merkl. Partition Constraints for Conjunctive Queries: Bounds and Worst-Case Optimal Joins. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 17:1-17:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{deeds_et_al:LIPIcs.ICDT.2025.17,
  author =	{Deeds, Kyle and Merkl, Timo Camillo},
  title =	{{Partition Constraints for Conjunctive Queries: Bounds and Worst-Case Optimal Joins}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{17:1--17:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.17},
  URN =		{urn:nbn:de:0030-drops-229588},
  doi =		{10.4230/LIPIcs.ICDT.2025.17},
  annote =	{Keywords: Worst-Case Optimal Joins, Cardinality Bounds, Degeneracy, Degree Constraints, Partition Constraints}
}
Document
Quantum Simultaneous Protocols Without Public Coins Using Modified Equality Queries

Authors: François Le Gall, Oran Nadler, Harumichi Nishimura, and Rotem Oshman

Published in: LIPIcs, Volume 324, 28th International Conference on Principles of Distributed Systems (OPODIS 2024)


Abstract
In this paper we study a quantum version of the multiparty simultaneous message-passing (SMP) model, and we show that in some cases, quantum communication can replace public randomness, even with no entanglement between the parties. This was already known for two players, but not for more than two players, and indeed, so far all that was known was a negative result. Our main technical contribution is a compiler that takes any classical public-coin simultaneous protocol based on "modified equality queries," and converts it into a quantum simultaneous protocol without public coins with roughly the same communication complexity. We then use our compiler to derive protocols for several problems, including frequency moments, neighborhood diversity, enumeration of isolated cliques, and more.

Cite as

François Le Gall, Oran Nadler, Harumichi Nishimura, and Rotem Oshman. Quantum Simultaneous Protocols Without Public Coins Using Modified Equality Queries. In 28th International Conference on Principles of Distributed Systems (OPODIS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 324, pp. 34:1-34:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{legall_et_al:LIPIcs.OPODIS.2024.34,
  author =	{Le Gall, Fran\c{c}ois and Nadler, Oran and Nishimura, Harumichi and Oshman, Rotem},
  title =	{{Quantum Simultaneous Protocols Without Public Coins Using Modified Equality Queries}},
  booktitle =	{28th International Conference on Principles of Distributed Systems (OPODIS 2024)},
  pages =	{34:1--34:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-360-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{324},
  editor =	{Bonomi, Silvia and Galletta, Letterio and Rivi\`{e}re, Etienne and Schiavoni, Valerio},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.OPODIS.2024.34},
  URN =		{urn:nbn:de:0030-drops-225701},
  doi =		{10.4230/LIPIcs.OPODIS.2024.34},
  annote =	{Keywords: SMP model, multi-party communication, quantum distributed algorithms}
}
Document
Fishing Fort: A System for Graph Analytics with ML Prediction and Logic Deduction

Authors: Wenfei Fan and Shuhao Liu

Published in: OASIcs, Volume 119, The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen (2024)


Abstract
This paper reports Fishing Fort, a graph analytic system developed in response to the following questions. What practical value can we get out of graph analytics? How can we effectively deduce the value from a real-life graph? Where can we get clean graphs to make accurate analyses possible? To answer these questions, Fishing Fort advocates to unify logic deduction and ML prediction by proposing Graph Association Rules (GARs), a class of logic rules in which ML models can be embedded as predicates. It employs GARs to deduce graph associations, enrich graphs and clean graphs. It has been deployed in production lines and proven effective in online recommendation, drug discovery, credit risk assessment, battery manufacturing and cybersecurity, among other things.

Cite as

Wenfei Fan and Shuhao Liu. Fishing Fort: A System for Graph Analytics with ML Prediction and Logic Deduction. In The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen. Open Access Series in Informatics (OASIcs), Volume 119, pp. 6:1-6:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{fan_et_al:OASIcs.Tannen.6,
  author =	{Fan, Wenfei and Liu, Shuhao},
  title =	{{Fishing Fort: A System for Graph Analytics with ML Prediction and Logic Deduction}},
  booktitle =	{The Provenance of Elegance in Computation - Essays Dedicated to Val Tannen},
  pages =	{6:1--6:18},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-320-1},
  ISSN =	{2190-6807},
  year =	{2024},
  volume =	{119},
  editor =	{Amarilli, Antoine and Deutsch, Alin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Tannen.6},
  URN =		{urn:nbn:de:0030-drops-201025},
  doi =		{10.4230/OASIcs.Tannen.6},
  annote =	{Keywords: graph analytics, data cleaning, association analysis}
}
Document
Vision
Machine Learning and Knowledge Graphs: Existing Gaps and Future Research Challenges

Authors: Claudia d'Amato, Louis Mahon, Pierre Monnin, and Giorgos Stamou

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1


Abstract
The graph model is nowadays largely adopted to model a wide range of knowledge and data, spanning from social networks to knowledge graphs (KGs), representing a successful paradigm of how symbolic and transparent AI can scale on the World Wide Web. However, due to their unprecedented volume, they are generally tackled by Machine Learning (ML) and mostly numeric based methods such as graph embedding models (KGE) and deep neural networks (DNNs). The latter methods have been proved lately very efficient, leading the current AI spring. In this vision paper, we introduce some of the main existing methods for combining KGs and ML, divided into two categories: those using ML to improve KGs, and those using KGs to improve results on ML tasks. From this introduction, we highlight research gaps and perspectives that we deem promising and currently under-explored for the involved research communities, spanning from KG support for LLM prompting, integration of KG semantics in ML models to symbol-based methods, interpretability of ML models, and the need for improved benchmark datasets. In our opinion, such perspectives are stepping stones in an ultimate view of KGs as central assets for neuro-symbolic and explainable AI.

Cite as

Claudia d'Amato, Louis Mahon, Pierre Monnin, and Giorgos Stamou. Machine Learning and Knowledge Graphs: Existing Gaps and Future Research Challenges. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 8:1-8:35, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Article{damato_et_al:TGDK.1.1.8,
  author =	{d'Amato, Claudia and Mahon, Louis and Monnin, Pierre and Stamou, Giorgos},
  title =	{{Machine Learning and Knowledge Graphs: Existing Gaps and Future Research Challenges}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{8:1--8:35},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.8},
  URN =		{urn:nbn:de:0030-drops-194824},
  doi =		{10.4230/TGDK.1.1.8},
  annote =	{Keywords: Graph-based Learning, Knowledge Graph Embeddings, Large Language Models, Explainable AI, Knowledge Graph Completion \& Curation}
}
Document
Position
Large Language Models and Knowledge Graphs: Opportunities and Challenges

Authors: Jeff Z. Pan, Simon Razniewski, Jan-Christoph Kalo, Sneha Singhania, Jiaoyan Chen, Stefan Dietze, Hajira Jabeen, Janna Omeliyanenko, Wen Zhang, Matteo Lissandrini, Russa Biswas, Gerard de Melo, Angela Bonifati, Edlira Vakaj, Mauro Dragoni, and Damien Graux

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1


Abstract
Large Language Models (LLMs) have taken Knowledge Representation - and the world - by storm. This inflection point marks a shift from explicit knowledge representation to a renewed focus on the hybrid representation of both explicit knowledge and parametric knowledge. In this position paper, we will discuss some of the common debate points within the community on LLMs (parametric knowledge) and Knowledge Graphs (explicit knowledge) and speculate on opportunities and visions that the renewed focus brings, as well as related research topics and challenges.

Cite as

Jeff Z. Pan, Simon Razniewski, Jan-Christoph Kalo, Sneha Singhania, Jiaoyan Chen, Stefan Dietze, Hajira Jabeen, Janna Omeliyanenko, Wen Zhang, Matteo Lissandrini, Russa Biswas, Gerard de Melo, Angela Bonifati, Edlira Vakaj, Mauro Dragoni, and Damien Graux. Large Language Models and Knowledge Graphs: Opportunities and Challenges. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 2:1-2:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Article{pan_et_al:TGDK.1.1.2,
  author =	{Pan, Jeff Z. and Razniewski, Simon and Kalo, Jan-Christoph and Singhania, Sneha and Chen, Jiaoyan and Dietze, Stefan and Jabeen, Hajira and Omeliyanenko, Janna and Zhang, Wen and Lissandrini, Matteo and Biswas, Russa and de Melo, Gerard and Bonifati, Angela and Vakaj, Edlira and Dragoni, Mauro and Graux, Damien},
  title =	{{Large Language Models and Knowledge Graphs: Opportunities and Challenges}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{2:1--2:38},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.2},
  URN =		{urn:nbn:de:0030-drops-194766},
  doi =		{10.4230/TGDK.1.1.2},
  annote =	{Keywords: Large Language Models, Pre-trained Language Models, Knowledge Graphs, Ontology, Retrieval Augmented Language Models}
}
Document
Survey
Knowledge Graph Embeddings: Open Challenges and Opportunities

Authors: Russa Biswas, Lucie-Aimée Kaffee, Michael Cochez, Stefania Dumbrava, Theis E. Jendal, Matteo Lissandrini, Vanessa Lopez, Eneldo Loza Mencía, Heiko Paulheim, Harald Sack, Edlira Kalemi Vakaj, and Gerard de Melo

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1


Abstract
While Knowledge Graphs (KGs) have long been used as valuable sources of structured knowledge, in recent years, KG embeddings have become a popular way of deriving numeric vector representations from them, for instance, to support knowledge graph completion and similarity search. This study surveys advances as well as open challenges and opportunities in this area. For instance, the most prominent embedding models focus primarily on structural information. However, there has been notable progress in incorporating further aspects, such as semantics, multi-modal, temporal, and multilingual features. Most embedding techniques are assessed using human-curated benchmark datasets for the task of link prediction, neglecting other important real-world KG applications. Many approaches assume a static knowledge graph and are unable to account for dynamic changes. Additionally, KG embeddings may encode data biases and lack interpretability. Overall, this study provides an overview of promising research avenues to learn improved KG embeddings that can address a more diverse range of use cases.

Cite as

Russa Biswas, Lucie-Aimée Kaffee, Michael Cochez, Stefania Dumbrava, Theis E. Jendal, Matteo Lissandrini, Vanessa Lopez, Eneldo Loza Mencía, Heiko Paulheim, Harald Sack, Edlira Kalemi Vakaj, and Gerard de Melo. Knowledge Graph Embeddings: Open Challenges and Opportunities. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 4:1-4:32, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Article{biswas_et_al:TGDK.1.1.4,
  author =	{Biswas, Russa and Kaffee, Lucie-Aim\'{e}e and Cochez, Michael and Dumbrava, Stefania and Jendal, Theis E. and Lissandrini, Matteo and Lopez, Vanessa and Menc{\'\i}a, Eneldo Loza and Paulheim, Heiko and Sack, Harald and Vakaj, Edlira Kalemi and de Melo, Gerard},
  title =	{{Knowledge Graph Embeddings: Open Challenges and Opportunities}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:32},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.4},
  URN =		{urn:nbn:de:0030-drops-194783},
  doi =		{10.4230/TGDK.1.1.4},
  annote =	{Keywords: Knowledge Graphs, KG embeddings, Link prediction, KG applications}
}
  • Refine by Type
  • 16 Document/PDF
  • 14 Document/HTML

  • Refine by Publication Year
  • 11 2025
  • 1 2024
  • 3 2023
  • 1 2017

  • Refine by Author
  • 2 Biswas, Russa
  • 2 Lissandrini, Matteo
  • 2 de Melo, Gerard
  • 1 Angeris, Guillermo
  • 1 Bernstein, Abraham
  • Show More...

  • Refine by Series/Journal
  • 9 LIPIcs
  • 2 OASIcs
  • 5 TGDK

  • Refine by Classification
  • 2 Computing methodologies → Semantic networks
  • 1 Applied computing → Computational biology
  • 1 Computing methodologies → Artificial intelligence
  • 1 Computing methodologies → Information extraction
  • 1 Computing methodologies → Knowledge representation and reasoning
  • Show More...

  • Refine by Keyword
  • 3 Knowledge Graphs
  • 3 Large Language Models
  • 1 Android Malware Detection
  • 1 Application Security
  • 1 Argument Mining
  • Show More...

Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail