DROPS

Document

DOI: 10.4230/LIPIcs.STACS.2026.73

Colouring Probe H-Free Graphs

Authors: Daniël Paulusma, Johannes Rauch, and Erik Jan van Leeuwen

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)

Abstract

The NP-complete problems Colouring and k-Colouring (k ≥ 3) are well studied on H-free graphs, i.e., graphs that do not contain some fixed graph H as an induced subgraph. We research to what extent the known polynomial-time algorithms for H-free graphs can be generalized if we only know some of the edges of the input graph. We do this by considering the classical probe graph model introduced in the early nineties. For a graph H, a partitioned probe H-free graph (G,P,N) consists of a graph G = (V,E), together with a set P ⊆ V of probes and an independent set N = V ⧵ P of non-probes, such that G+F is H-free for some edge set F ⊆ binom(N,2). We show the following: - We fully classify Colouring on partitioned probe H-free graphs and show that the obtained complexity dichotomy differs from the known dichotomy of Colouring for H-free graphs. - We fully classify 3-Colouring on partitioned probe P_t-free graphs: we prove polynomial-time solvability for t ≤ 5 and NP-completeness for t ≥ 6. In contrast, 3-Colouring on P_t-free graphs is known to be polynomial-time solvable for t ≤ 7 and quasi-polynomial-time solvable for t ≥ 8. Our main result is our polynomial-time algorithm for 3-Colouring on partitioned P₅-free graphs. For this result, and also for all our other polynomial-time results, we do not need to know the edge set F; we only need to know its existence. Moreover, the class of probe P₅-free graphs includes not only paths of arbitrary length but even all bipartite graphs and is much richer than the class of P₅-free graphs. The latter is also evidenced by the fact that there exist graph problems, such as Matching Cut, that are known to be polynomial-time solvable for P₅-free graphs but NP-complete for partitioned probe P₅-free graphs. In particular, unlike the class of 3-colourable P₅-free graphs, the class of 3-colourable probe P₅-free graphs has unbounded mim-width. Hence, our polynomial-time result for 3-Colouring for probe P₅-free graphs suggests that there may be another, deeper overarching reason why 3-Colouring is polynomial-time solvable for P₅-free graphs.

Cite as

Daniël Paulusma, Johannes Rauch, and Erik Jan van Leeuwen. Colouring Probe H-Free Graphs. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 73:1-73:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

Copy BibTex To Clipboard

@InProceedings{paulusma_et_al:LIPIcs.STACS.2026.73,
  author =	{Paulusma, Dani\"{e}l and Rauch, Johannes and van Leeuwen, Erik Jan},
  title =	{{Colouring Probe H-Free Graphs}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{73:1--73:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.73},
  URN =		{urn:nbn:de:0030-drops-255621},
  doi =		{10.4230/LIPIcs.STACS.2026.73},
  annote =	{Keywords: colouring, probe graph, forbidden induced subgraph, complexity dichotomy}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2026.50

On the PTAS Complexity of Multidimensional Knapsack

Authors: Ilan Doron-Arad, Ariel Kulik, and Pasin Manurangsi

Published in: LIPIcs, Volume 362, 17th Innovations in Theoretical Computer Science Conference (ITCS 2026)

Abstract

We study the d-dimensional knapsack problem. We are given a set of items, each with a d-dimensional cost vector and a profit, along with a d-dimensional budget vector. The goal is to select a set of items that do not exceed the budget in all dimensions and maximize the total profit. A polynomial-time approximation scheme (PTAS) with running time n^{Θ(d/{ε})} has long been known for this problem, where {ε} is the error parameter and n is the encoding size. Despite decades of active research, the best running time of a PTAS has remained O(n^{⌈ d/{ε} ⌉ - d}). Unfortunately, existing lower bounds only cover the special case with two dimensions d = 2, and do not answer whether there is a n^{o(d/({ε)})}-time PTAS for larger values of d. In this work, we show that the running times of the best-known PTAS cannot be improved up to a polylogarithmic factor assuming the Exponential Time Hypothesis (ETH). Our techniques are based on a robust reduction from 2-CSP, which embeds 2-CSP constraints into a desired number of dimensions. Then, using a recent result of [Bafna Karthik and Minzer, STOC'25], we succeed in exhibiting tight trade-off between d and {ε} for all regimes of the parameters assuming d is sufficiently large. Informally, our result also shows that under ETH, for any function f there is no f(d/({ε)}) ⋅ n^{õ(d/({ε)})}-time (1-{ε})-approximation for d-dimensional knapsack, where n is the number of items and õ hides polylogarithmic factors in d/({ε)}.

Cite as

Ilan Doron-Arad, Ariel Kulik, and Pasin Manurangsi. On the PTAS Complexity of Multidimensional Knapsack. In 17th Innovations in Theoretical Computer Science Conference (ITCS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 362, pp. 50:1-50:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

Copy BibTex To Clipboard

@InProceedings{doronarad_et_al:LIPIcs.ITCS.2026.50,
  author =	{Doron-Arad, Ilan and Kulik, Ariel and Manurangsi, Pasin},
  title =	{{On the PTAS Complexity of Multidimensional Knapsack}},
  booktitle =	{17th Innovations in Theoretical Computer Science Conference (ITCS 2026)},
  pages =	{50:1--50:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-410-9},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{362},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2026.50},
  URN =		{urn:nbn:de:0030-drops-253377},
  doi =		{10.4230/LIPIcs.ITCS.2026.50},
  annote =	{Keywords: d-dimensional Knapsack, Multidimensional Knapsack, PTAS, CSP}
}

Document

Research

DOI: 10.4230/TGDK.3.3.4

Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web

Authors: Florian Ruosch, Cristina Sarasua, and Abraham Bernstein

Published in: TGDK, Volume 3, Issue 3 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 3

Abstract

In Argument Mining, predicting argumentative relations between texts (or spans) remains one of the most challenging aspects, even more so in the cross-document setting. This paper makes three key contributions to advance research in this domain. We first extend an existing dataset, the Sci-Arg corpus, by annotating it with explicit inter-document argumentative relations, thereby allowing arguments to be distributed over several documents forming an Argument Web; these new annotations are published using Semantic Web technologies (RDF, OWL). Second, we explore and evaluate three automated approaches for predicting these inter-document argumentative relations, establishing critical baselines on the new dataset. We find that a simple classifier based on discourse indicators with access to context outperforms neural methods. Third, we conduct a comparative analysis of these approaches for both intra- and inter-document settings, identifying statistically significant differences in results that indicate the necessity of distinguishing between these two scenarios. Our findings highlight significant challenges in this complex domain and open crucial avenues for future research on the Argument Web of Science, particularly for those interested in leveraging Semantic Web technologies and knowledge graphs to understand scholarly discourse. With this, we provide the first stepping stones in the form of a benchmark dataset, three baseline methods, and an initial analysis for a systematic exploration of this field relevant to the Web of Data and Science.

Cite as

Florian Ruosch, Cristina Sarasua, and Abraham Bernstein. Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3, pp. 4:1-4:33, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{ruosch_et_al:TGDK.3.3.4,
  author =	{Ruosch, Florian and Sarasua, Cristina and Bernstein, Abraham},
  title =	{{Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:33},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.3.4},
  URN =		{urn:nbn:de:0030-drops-252159},
  doi =		{10.4230/TGDK.3.3.4},
  annote =	{Keywords: Argument Mining, Large Language Models, Knowledge Graphs, Link Prediction}
}

Document

DOI: 10.4230/LIPIcs.FSTTCS.2025.17

Cutoff for the Swendsen–Wang Dynamics on the Complete Graph

Authors: Antonio Blanca and Zhezheng Song

Published in: LIPIcs, Volume 360, 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)

Abstract

We study the speed of convergence of the Swendsen-Wang (SW) dynamics for the q-state ferromagnetic Potts model on the n-vertex complete graph, known as the mean-field model. The SW dynamics was introduced as an attractive alternative to the local Glauber dynamics, often offering faster convergence rates to stationarity in a variety of settings. A series of works have characterized the asymptotic behavior of the speed of convergence of the mean-field SW dynamics for all q ≥ 2 and all values of the inverse temperature parameter β > 0. In particular, it is known that when β > q the mixing time of the SW dynamics is Θ(log n). We strengthen this result by showing that for all β > q, there exists a constant c(β,q) > 0 such that the mixing time of the SW dynamics is c(β,q) log n + Θ(1). This implies that the mean-field SW dynamics exhibits the cutoff phenomenon in this temperature regime, demonstrating that this Markov chain undergoes a sharp transition from "far from stationarity" to "well-mixed" within a narrow Θ(1) time window. The presence of cutoff is algorithmically significant, as simulating the chain for fewer steps than its mixing time could lead to highly biased samples.

Cite as

Antonio Blanca and Zhezheng Song. Cutoff for the Swendsen–Wang Dynamics on the Complete Graph. In 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 360, pp. 17:1-17:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{blanca_et_al:LIPIcs.FSTTCS.2025.17,
  author =	{Blanca, Antonio and Song, Zhezheng},
  title =	{{Cutoff for the Swendsen–Wang Dynamics on the Complete Graph}},
  booktitle =	{45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)},
  pages =	{17:1--17:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-406-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{360},
  editor =	{Aiswarya, C. and Mehta, Ruta and Roy, Subhajit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2025.17},
  URN =		{urn:nbn:de:0030-drops-250987},
  doi =		{10.4230/LIPIcs.FSTTCS.2025.17},
  annote =	{Keywords: Markov chains, mixing times, cutoff phenomenon, Potts model, mean-field}
}

Document

DOI: 10.4230/LIPIcs.ISAAC.2025.4

Finding d-Cuts in Claw-Free Graphs

Authors: Jungho Ahn, Tala Eagling-Vose, Felicia Lucke, Daniël Paulusma, and Siani Smith

Published in: LIPIcs, Volume 359, 36th International Symposium on Algorithms and Computation (ISAAC 2025)

Abstract

The Matching Cut problem is to decide if the vertex set of a connected graph can be partitioned into two non-empty sets B and R such that the edges between B and R form a matching, that is, every vertex in B has at most one neighbour in R, and vice versa. If for some integer d ≥ 1, we allow every vertex in B to have at most d neighbours in R, and vice versa, we obtain the more general problem d-Cut. It is known that d-Cut is NP-complete for every d ≥ 1. However, for claw-free graphs, it is only known that d-Cut is polynomial-time solvable for d = 1 and NP-complete for d ≥ 3. We resolve the missing case d = 2 by proving NP-completeness. This follows from our more general study, in which we also bound the maximum degree. That is, we prove that for every d ≥ 2, d-Cut, restricted to claw-free graphs of maximum degree p, is constant-time solvable if p ≤ 2d+1 and NP-complete if p ≥ 2d+3. Moreover, in the former case, we can find a d-cut in linear time. We also show how our positive results for claw-free graphs can be generalized to S_{1^t,𝓁}-free graphs where S_{1^t,𝓁} is the graph obtained from a star on t+2 vertices by subdividing one of its edges exactly 𝓁 times.

Cite as

Jungho Ahn, Tala Eagling-Vose, Felicia Lucke, Daniël Paulusma, and Siani Smith. Finding d-Cuts in Claw-Free Graphs. In 36th International Symposium on Algorithms and Computation (ISAAC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 359, pp. 4:1-4:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{ahn_et_al:LIPIcs.ISAAC.2025.4,
  author =	{Ahn, Jungho and Eagling-Vose, Tala and Lucke, Felicia and Paulusma, Dani\"{e}l and Smith, Siani},
  title =	{{Finding d-Cuts in Claw-Free Graphs}},
  booktitle =	{36th International Symposium on Algorithms and Computation (ISAAC 2025)},
  pages =	{4:1--4:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-408-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{359},
  editor =	{Chen, Ho-Lin and Hon, Wing-Kai and Tsai, Meng-Tsung},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2025.4},
  URN =		{urn:nbn:de:0030-drops-249121},
  doi =		{10.4230/LIPIcs.ISAAC.2025.4},
  annote =	{Keywords: matching cut, d-cut, claw-free, maximum degree}
}

Document

DOI: 10.4230/OASIcs.DX.2025.12

GEMMA-FD: Zero-Shot Fault Detection in Heat Pumps Using Multimodal Language Models

Authors: Herbert Muehlburger and Franz Wotawa

Published in: OASIcs, Volume 136, 36th International Conference on Principles of Diagnosis and Resilient Systems (DX 2025)

Abstract

Fault detection in heating systems is critical for ensuring energy efficiency and operational reliability. Traditional approaches rely on labeled fault data and expert-defined rules, which are often unavailable or costly to obtain. We introduce GEMMA-FD (GEMMA for Fault Detection), a novel zero-shot framework for fault detection in heat pumps that leverages large language models (LLMs) without requiring labeled anomalies or predefined fault signatures. Our method transforms multivariate sensor time series into structured natural language prompts and augments them with visual features, such as line plots of key variables, to facilitate multimodal reasoning. Using GEMMA-3, an open-weight multimodal LLM, we classify heat pump system states as either normal or faulty. Experiments on a real-world heat pump dataset show that GEMMA-FD can identify unseen faults with reasonable precision, although its performance remains lower than a supervised XGBoost baseline trained on the same prompts. Specifically, GEMMA-FD achieves a macro-F1 score of 0.252, compared to 0.69 for XGBoost, underscoring the trade-off between generalization and targeted accuracy. Nevertheless, GEMMA-FD demonstrates the potential of foundation models for interpretable, multilingual fault detection in cyber-physical systems, while highlighting the need for prompt engineering, few-shot augmentation, and multimodal inputs to improve the classification of rare and complex fault types.

Cite as

Herbert Muehlburger and Franz Wotawa. GEMMA-FD: Zero-Shot Fault Detection in Heat Pumps Using Multimodal Language Models. In 36th International Conference on Principles of Diagnosis and Resilient Systems (DX 2025). Open Access Series in Informatics (OASIcs), Volume 136, pp. 12:1-12:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{muehlburger_et_al:OASIcs.DX.2025.12,
  author =	{Muehlburger, Herbert and Wotawa, Franz},
  title =	{{GEMMA-FD: Zero-Shot Fault Detection in Heat Pumps Using Multimodal Language Models}},
  booktitle =	{36th International Conference on Principles of Diagnosis and Resilient Systems (DX 2025)},
  pages =	{12:1--12:17},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-394-2},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{136},
  editor =	{Quinones-Grueiro, Marcos and Biswas, Gautam and Pill, Ingo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.DX.2025.12},
  URN =		{urn:nbn:de:0030-drops-248015},
  doi =		{10.4230/OASIcs.DX.2025.12},
  annote =	{Keywords: fault detection, anomaly detection, cyber-physical systems, HVAC, heat pumps, energy systems, large language models, zero-shot learning, open-weight LLMs, interpretable AI, multimodal prompts, smart energy}
}

@InProceedings{muehlburger_et_al:OASIcs.DX.2025.12,
  author =	{Muehlburger, Herbert and Wotawa, Franz},
  title =	{{GEMMA-FD: Zero-Shot Fault Detection in Heat Pumps Using Multimodal Language Models}},
  booktitle =	{36th International Conference on Principles of Diagnosis and Resilient Systems (DX 2025)},
  pages =	{12:1--12:17},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-394-2},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{136},
  editor =	{Quinones-Grueiro, Marcos and Biswas, Gautam and Pill, Ingo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.DX.2025.12},
  URN =		{urn:nbn:de:0030-drops-248015},
  doi =		{10.4230/OASIcs.DX.2025.12},
  annote =	{Keywords: fault detection, anomaly detection, cyber-physical systems, HVAC, heat pumps, energy systems, large language models, zero-shot learning, open-weight LLMs, interpretable AI, multimodal prompts, smart energy}
}

Document

Survey

DOI: 10.4230/TGDK.3.2.1

Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2

Abstract

In recent years, knowledge graphs have gained interest and witnessed widespread applications in various domains, such as information retrieval, question-answering, recommendation systems, amongst others. Large-scale knowledge graphs to this end have demonstrated their utility in effectively representing structured knowledge. To further facilitate the application of machine learning techniques, knowledge graph embedding models have been developed. Such models can transform entities and relationships within knowledge graphs into vectors. However, these embedding models often face challenges related to noise, missing information, distribution shift, adversarial attacks, etc. This can lead to sub-optimal embeddings and incorrect inferences, thereby negatively impacting downstream applications. While the existing literature has focused so far on adversarial attacks on KGE models, the challenges related to the other critical aspects remain unexplored. In this paper, we, first of all, give a unified definition of resilience, encompassing several factors such as generalisation, in-distribution generalization, distribution adaption, and robustness. After formalizing these concepts for machine learning in general, we define them in the context of knowledge graphs. To find the gap in the existing works on resilience in the context of knowledge graphs, we perform a systematic survey, taking into account all these aspects mentioned previously. Our survey results show that most of the existing works focus on a specific aspect of resilience, namely robustness. After categorizing such works based on their respective aspects of resilience, we discuss the challenges and future research directions.

Cite as

Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo. Resilience in Knowledge Graph Embeddings. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 1:1-1:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{sharma_et_al:TGDK.3.2.1,
  author =	{Sharma, Arnab and Kouagou, N'Dah Jean and Ngomo, Axel-Cyrille Ngonga},
  title =	{{Resilience in Knowledge Graph Embeddings}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:38},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.1},
  URN =		{urn:nbn:de:0030-drops-248117},
  doi =		{10.4230/TGDK.3.2.1},
  annote =	{Keywords: Knowledge graphs, Resilience, Robustness}
}

Document

DOI: 10.4230/LIPIcs.AFT.2025.18

From Permissioned to Proof-of-Stake Consensus

Authors: Jovan Komatovic, Andrew Lewis-Pye, Joachim Neu, Tim Roughgarden, and Ertem Nusret Tas

Published in: LIPIcs, Volume 354, 7th Conference on Advances in Financial Technologies (AFT 2025)

Abstract

This paper presents the first generic compiler that transforms any permissioned consensus protocol into a proof-of-stake permissionless consensus protocol. For each of the following properties, if the initial permissioned protocol satisfies that property in the partially synchronous setting, the consequent proof-of-stake protocol also satisfies that property in the partially synchronous and quasi-permissionless setting (with the same fault-tolerance): consistency; liveness; optimistic responsiveness; every composable log-specific property; and message complexity of a given order. Moreover, our transformation ensures that the output protocol satisfies accountability (identifying culprits in the event of a consistency violation), whether or not the original permissioned protocol satisfied it.

Cite as

Jovan Komatovic, Andrew Lewis-Pye, Joachim Neu, Tim Roughgarden, and Ertem Nusret Tas. From Permissioned to Proof-of-Stake Consensus. In 7th Conference on Advances in Financial Technologies (AFT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 354, pp. 18:1-18:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{komatovic_et_al:LIPIcs.AFT.2025.18,
  author =	{Komatovic, Jovan and Lewis-Pye, Andrew and Neu, Joachim and Roughgarden, Tim and Tas, Ertem Nusret},
  title =	{{From Permissioned to Proof-of-Stake Consensus}},
  booktitle =	{7th Conference on Advances in Financial Technologies (AFT 2025)},
  pages =	{18:1--18:26},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-400-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{354},
  editor =	{Avarikioti, Zeta and Christin, Nicolas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.AFT.2025.18},
  URN =		{urn:nbn:de:0030-drops-247373},
  doi =		{10.4230/LIPIcs.AFT.2025.18},
  annote =	{Keywords: Permissioned Consensus, Proof-of-Stake, generic Compiler, Blockchain}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.22

Constructing Long Paths in Graph Streams

Authors: Christian Konrad and Chhaya Trehan

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

In the graph stream model of computation, an algorithm processes the edges of an n-vertex input graph in one or more sequential passes while using a memory that is sublinear in the input size. The streaming model poses significant challenges for algorithmically constructing long paths. Many known algorithms that are tasked with extending an existing path as a subroutine require an entire pass over the input to add a single additional edge. This raises a fundamental question: Are multiple passes inherently necessary to construct paths of non-trivial lengths, or can a single pass suffice? To address this question, we systematically study the Longest Path problem in the one-pass streaming model. In this problem, given a desired approximation factor α, the objective is to compute a path of length at least lp(G)/α, where lp(G) is the length of a longest path in the input graph G. We study the problem in the insertion-only and the insertion-deletion streaming models, and we give algorithms as well as space lower bounds for both undirected and directed graphs. Our results are: 1) We show that for undirected graphs, in both the insertion-only and the insertion-deletion models, there are semi-streaming algorithms, i.e., algorithms that use space O(n poly log n), that compute a path of length at least d/3 with high probability, where d is the average degree of the input graph. These algorithms can also yield an α-approximation to Longest Path using space Õ(n²/α). 2) Next, we show that such a result cannot be achieved for directed graphs, even in the insertion-only model. We show that computing a (n^{1-o(1)})-approximation to Longest Path in directed graphs in the insertion-only model requires space Ω(n²). This result is in line with recent results that demonstrate that processing directed graphs is often significantly harder than undirected graphs in the streaming model. 3) We further complement our results with two additional lower bounds. First, we show that semi-streaming space is insufficient for small constant factor approximations to Longest Path for undirected graphs in the insertion-only model. Last, in undirected graphs in the insertion-deletion model, we show that computing an α-approximation requires space Ω(n²/α³).

Cite as

Christian Konrad and Chhaya Trehan. Constructing Long Paths in Graph Streams. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 22:1-22:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{konrad_et_al:LIPIcs.ESA.2025.22,
  author =	{Konrad, Christian and Trehan, Chhaya},
  title =	{{Constructing Long Paths in Graph Streams}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{22:1--22:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.22},
  URN =		{urn:nbn:de:0030-drops-244902},
  doi =		{10.4230/LIPIcs.ESA.2025.22},
  annote =	{Keywords: Longest Path Problem, Streaming Algorithms, One-way Two-party Communication Complexity}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.18

Efficiency of Learned Indexes on Genome Spectra

Authors: Md. Hasin Abrar, Paul Medvedev, and Giorgio Vinciguerra

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

Data structures on a multiset of genomic k-mers are at the heart of many bioinformatic tools. As genomic datasets grow in scale, the efficiency of these data structures increasingly depends on how well they leverage the inherent patterns in the data. One recent and effective approach is the use of learned indexes that approximate the rank function of a multiset using a piecewise linear function with very few segments. However, theoretical worst-case analysis struggles to predict the practical performance of these indexes. We address this limitation by developing a novel measure of piecewise-linear approximability of the data, called CaPLa (Canonical Piecewise Linear approximability). CaPLa builds on the empirical observation that a power-law model often serves as a reasonable proxy for piecewise linear-approximability, while explicitly accounting for deviations from a true power-law fit. We prove basic properties of CaPLa and present an efficient algorithm to compute it. We then demonstrate that CaPLa can accurately predict space bounds for data structures on real data. Empirically, we analyze over 500 genomes through the lens of CaPLa, revealing that it varies widely across the tree of life and even within individual genomes. Finally, we study the robustness of CaPLa as a measure and the factors that make genomic k-mer multisets different from random ones.

Cite as

Md. Hasin Abrar, Paul Medvedev, and Giorgio Vinciguerra. Efficiency of Learned Indexes on Genome Spectra. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 18:1-18:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{abrar_et_al:LIPIcs.ESA.2025.18,
  author =	{Abrar, Md. Hasin and Medvedev, Paul and Vinciguerra, Giorgio},
  title =	{{Efficiency of Learned Indexes on Genome Spectra}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{18:1--18:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.18},
  URN =		{urn:nbn:de:0030-drops-244865},
  doi =		{10.4230/LIPIcs.ESA.2025.18},
  annote =	{Keywords: Genome spectra, piecewise linear approximation, learned index, k-mers}
}

Document

DOI: 10.4230/LIPIcs.ITP.2025.26

Improving the SMT Proof Reconstruction Pipeline in Isabelle/HOL

Authors: Hanna Lachnitt, Mathias Fleury, Haniel Barbosa, Jibiana Jakpor, Bruno Andreotti, Andrew Reynolds, Hans-Jörg Schurr, Clark Barrett, and Cesare Tinelli

Published in: LIPIcs, Volume 352, 16th International Conference on Interactive Theorem Proving (ITP 2025)

Abstract

Sledgehammer is a tool that increases the level of automation in the Isabelle/HOL proof assistant by asking external automatic theorem provers (ATPs), including SMT solvers, to prove the current goal. When the external ATP succeeds it must provide enough evidence that the goal holds for Isabelle to be able to reprove it internally based on that evidence. In particular, Isabelle can do this by replaying fine-grained proof certificates from proof-producing SMT solvers as long as they are expressed in the Alethe format, which until now was supported only by the veriT SMT solver. We report on our experience adding proof reconstruction support for the cvc5 SMT solver in Isabelle by extending cvc5 to produce proofs in the Alethe format and then adapting Isabelle to reconstruct those proofs. We discuss several difficulties and pitfalls we encountered and describe a set of tools and techniques we developed to improve the process. A notable outcome of this effort is that Isabelle can now be used as an independent proof checker for SMT problems written in the SMT-LIB standard. We evaluate cvc5’s integration on a set of SMT-LIB benchmarks originating from Isabelle as well as on a set of Isabelle proofs. Our results confirm that this integration complements and improves Sledgehammer’s capabilities.

Cite as

Hanna Lachnitt, Mathias Fleury, Haniel Barbosa, Jibiana Jakpor, Bruno Andreotti, Andrew Reynolds, Hans-Jörg Schurr, Clark Barrett, and Cesare Tinelli. Improving the SMT Proof Reconstruction Pipeline in Isabelle/HOL. In 16th International Conference on Interactive Theorem Proving (ITP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 352, pp. 26:1-26:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{lachnitt_et_al:LIPIcs.ITP.2025.26,
  author =	{Lachnitt, Hanna and Fleury, Mathias and Barbosa, Haniel and Jakpor, Jibiana and Andreotti, Bruno and Reynolds, Andrew and Schurr, Hans-J\"{o}rg and Barrett, Clark and Tinelli, Cesare},
  title =	{{Improving the SMT Proof Reconstruction Pipeline in Isabelle/HOL}},
  booktitle =	{16th International Conference on Interactive Theorem Proving (ITP 2025)},
  pages =	{26:1--26:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-396-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{352},
  editor =	{Forster, Yannick and Keller, Chantal},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITP.2025.26},
  URN =		{urn:nbn:de:0030-drops-246243},
  doi =		{10.4230/LIPIcs.ITP.2025.26},
  annote =	{Keywords: interactive theorem proving, proof assistants, Isabelle/HOL, SMT, certification, proof certificates, proof reconstruction, proof automation}
}

@InProceedings{lachnitt_et_al:LIPIcs.ITP.2025.26,
  author =	{Lachnitt, Hanna and Fleury, Mathias and Barbosa, Haniel and Jakpor, Jibiana and Andreotti, Bruno and Reynolds, Andrew and Schurr, Hans-J\"{o}rg and Barrett, Clark and Tinelli, Cesare},
  title =	{{Improving the SMT Proof Reconstruction Pipeline in Isabelle/HOL}},
  booktitle =	{16th International Conference on Interactive Theorem Proving (ITP 2025)},
  pages =	{26:1--26:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-396-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{352},
  editor =	{Forster, Yannick and Keller, Chantal},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITP.2025.26},
  URN =		{urn:nbn:de:0030-drops-246243},
  doi =		{10.4230/LIPIcs.ITP.2025.26},
  annote =	{Keywords: interactive theorem proving, proof assistants, Isabelle/HOL, SMT, certification, proof certificates, proof reconstruction, proof automation}
}

Document

DOI: 10.4230/OASIcs.SpaceCHI.2025.15

Integrating Human-In-The-Loop AI to Tackle Space Communication Delay Challenges

Authors: Nikos Mavrakis, Effie Lai-Chong Law, and Hubert P. H. Shum

Published in: OASIcs, Volume 130, Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)

Abstract

Deep space missions face significant communication delays that disrupt both operational workflows and psychological support for crew members. Unlike low Earth orbit operations, delays ranging from several minutes to nearly an hour make real-time communication with mission control infeasible, forcing crews to act with greater independence under uncertain conditions. This position paper examines how human-in-the-loop AI, digital twins, and edge AI can be integrated to mitigate these delays while maintaining astronaut autonomy and engagement. We argue that human-in-the-loop AI enables decision-making processes that are responsive to local context while remaining adaptable to changing mission demands. Digital twins offer real-time simulation and predictive modelling capabilities, allowing astronauts to explore options and troubleshoot without waiting for ground input. Edge AI brings computation closer to data sources, enabling low-latency inference onboard spacecraft for time-critical decisions. These ideas are explored through two use cases: using deepfakes to support emotionally resonant communication with loved ones, and applying visual-language models for onboard fault diagnosis and adaptive task replanning. We conclude with reflections on system design challenges under constrained and high-stakes conditions.

Cite as

Nikos Mavrakis, Effie Lai-Chong Law, and Hubert P. H. Shum. Integrating Human-In-The-Loop AI to Tackle Space Communication Delay Challenges. In Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025). Open Access Series in Informatics (OASIcs), Volume 130, pp. 15:1-15:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{mavrakis_et_al:OASIcs.SpaceCHI.2025.15,
  author =	{Mavrakis, Nikos and Law, Effie Lai-Chong and Shum, Hubert P. H.},
  title =	{{Integrating Human-In-The-Loop AI to Tackle Space Communication Delay Challenges}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{15:1--15:16},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.15},
  URN =		{urn:nbn:de:0030-drops-240051},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.15},
  annote =	{Keywords: Human-in-the-loop AI, communication delays, human spaceflight}
}

Document

DOI: 10.4230/LIPIcs.GIScience.2025.13

Assessing Map Reproducibility with Visual Question-Answering: An Empirical Evaluation

Authors: Eftychia Koukouraki, Auriol Degbelo, and Christian Kray

Published in: LIPIcs, Volume 346, 13th International Conference on Geographic Information Science (GIScience 2025)

Abstract

Reproducibility is a key principle of the modern scientific method. Maps, as an important means of communicating scientific results in GIScience and across disciplines, should be reproducible. Currently, map reproducibility assessment is done manually, which makes the assessment process tedious and time-consuming, ultimately limiting its efficiency. Hence, this work explores the extent to which Visual Question-Answering (VQA) can be used to automate some tasks relevant to map reproducibility assessment. We selected five state-of-the-art vision language models (VLMs) and followed a three-step approach to evaluate their ability to discriminate between maps and other images, interpret map content, and compare two map images using VQA. Our results show that current VLMs already possess map-reading capabilities and demonstrate understanding of spatial concepts, such as cardinal directions, geographic scope, and legend interpretation. Our paper demonstrates the potential of using VQA to support reproducibility assessment and highlights the outstanding issues that need to be addressed to achieve accurate, trustworthy map descriptions, thereby reducing the time and effort required by human evaluators.

Cite as

Eftychia Koukouraki, Auriol Degbelo, and Christian Kray. Assessing Map Reproducibility with Visual Question-Answering: An Empirical Evaluation. In 13th International Conference on Geographic Information Science (GIScience 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 346, pp. 13:1-13:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{koukouraki_et_al:LIPIcs.GIScience.2025.13,
  author =	{Koukouraki, Eftychia and Degbelo, Auriol and Kray, Christian},
  title =	{{Assessing Map Reproducibility with Visual Question-Answering: An Empirical Evaluation}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{13:1--13:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.13},
  URN =		{urn:nbn:de:0030-drops-238426},
  doi =		{10.4230/LIPIcs.GIScience.2025.13},
  annote =	{Keywords: map comparison, computational reproducibility, visual question answering, large language models, GeoAI}
}

@InProceedings{koukouraki_et_al:LIPIcs.GIScience.2025.13,
  author =	{Koukouraki, Eftychia and Degbelo, Auriol and Kray, Christian},
  title =	{{Assessing Map Reproducibility with Visual Question-Answering: An Empirical Evaluation}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{13:1--13:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.13},
  URN =		{urn:nbn:de:0030-drops-238426},
  doi =		{10.4230/LIPIcs.GIScience.2025.13},
  annote =	{Keywords: map comparison, computational reproducibility, visual question answering, large language models, GeoAI}
}

Document

DOI: 10.4230/OASIcs.Manzini.7

Algorithms for Computing Very Large BWTs: a Short Survey

Authors: Diego Díaz-Domínguez, Lavinia Egidi, Veronica Guerrini, Felipe A. Louza, and Giovanna Rosone

Published in: OASIcs, Volume 131, The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday (2025)

Abstract

The Burrows-Wheeler Transform (BWT) is a fundamental string transformation that, although initially introduced for data compression, has been extensively utilized across various domains, including text indexing and pattern matching within large datasets. Although the BWT construction is linear, the constants make the task impractical for large datasets, and as highlighted by Ferragina et al. [Paolo Ferragina et al., 2012], "to use it, one must first build it!". Thus, the construction of the BWT remains a significant challenge. For these reasons, during the past three decades there has been a succession of new algorithms for its construction using techniques that work in external memory or that use text compression. In this survey, we revise some of the most important advancements and tools presented in the past years for computing large BWTs exploiting external memory or text compression approaches without using additional information about the data.

Cite as

Diego Díaz-Domínguez, Lavinia Egidi, Veronica Guerrini, Felipe A. Louza, and Giovanna Rosone. Algorithms for Computing Very Large BWTs: a Short Survey. In The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 131, pp. 7:1-7:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{diazdominguez_et_al:OASIcs.Manzini.7,
  author =	{D{\'\i}az-Dom{\'\i}nguez, Diego and Egidi, Lavinia and Guerrini, Veronica and Louza, Felipe A. and Rosone, Giovanna},
  title =	{{Algorithms for Computing Very Large BWTs: a Short Survey}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{7:1--7:28},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.7},
  URN =		{urn:nbn:de:0030-drops-239151},
  doi =		{10.4230/OASIcs.Manzini.7},
  annote =	{Keywords: Burrows-Wheeler transform, Extended Burrows-Wheeler transform, external memory, text compression, longest common prefix}
}

@InProceedings{diazdominguez_et_al:OASIcs.Manzini.7,
  author =	{D{\'\i}az-Dom{\'\i}nguez, Diego and Egidi, Lavinia and Guerrini, Veronica and Louza, Felipe A. and Rosone, Giovanna},
  title =	{{Algorithms for Computing Very Large BWTs: a Short Survey}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{7:1--7:28},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.7},
  URN =		{urn:nbn:de:0030-drops-239151},
  doi =		{10.4230/OASIcs.Manzini.7},
  annote =	{Keywords: Burrows-Wheeler transform, Extended Burrows-Wheeler transform, external memory, text compression, longest common prefix}
}

Document

Survey

DOI: 10.4230/TGDK.3.1.3

Uncertainty Management in the Construction of Knowledge Graphs: A Survey

Authors: Lucas Jarnac, Yoan Chabot, and Miguel Couceiro

Published in: TGDK, Volume 3, Issue 1 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 1

Abstract

Knowledge Graphs (KGs) are a major asset for companies thanks to their great flexibility in data representation and their numerous applications, e.g., vocabulary sharing, Q&A or recommendation systems. To build a KG, it is a common practice to rely on automatic methods for extracting knowledge from various heterogeneous sources. However, in a noisy and uncertain world, knowledge may not be reliable and conflicts between data sources may occur. Integrating unreliable data would directly impact the use of the KG, therefore such conflicts must be resolved. This could be done manually by selecting the best data to integrate. This first approach is highly accurate, but costly and time-consuming. That is why recent efforts focus on automatic approaches, which represent a challenging task since it requires handling the uncertainty of extracted knowledge throughout its integration into the KG. We survey state-of-the-art approaches in this direction and present constructions of both open and enterprise KGs. We then describe different knowledge extraction methods and discuss downstream tasks after knowledge acquisition, including KG completion using embedding models, knowledge alignment, and knowledge fusion in order to address the problem of knowledge uncertainty in KG construction. We conclude with a discussion on the remaining challenges and perspectives when constructing a KG taking into account uncertainty.

Cite as

Lucas Jarnac, Yoan Chabot, and Miguel Couceiro. Uncertainty Management in the Construction of Knowledge Graphs: A Survey. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 3:1-3:48, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{jarnac_et_al:TGDK.3.1.3,
  author =	{Jarnac, Lucas and Chabot, Yoan and Couceiro, Miguel},
  title =	{{Uncertainty Management in the Construction of Knowledge Graphs: A Survey}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:48},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.3},
  URN =		{urn:nbn:de:0030-drops-233733},
  doi =		{10.4230/TGDK.3.1.3},
  annote =	{Keywords: Knowledge reconciliation, Uncertainty, Heterogeneous sources, Knowledge graph construction}
}

24 Search Results for "Long, Sheng"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message