DROPS

Document

DOI: 10.4230/LIPIcs.STACS.2026.28

Conditional Complexity Hardness: Monotone Circuit Size, Matrix Rigidity, and Tensor Rank

Authors: Nikolai Chukhin, Alexander S. Kulikov, Ivan Mihajlin, and Arina Smirnova

Published in: LIPIcs, Volume 364, 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)

Abstract

Proving complexity lower bounds remains a challenging task: currently, we only know how to prove conditional uniform (algorithm) lower bounds and nonuniform (circuit) lower bounds in restricted circuit models. About a decade ago, Williams (STOC 2010) showed how to derive nonuniform lower bounds from uniform upper bounds: roughly, by designing a fast algorithm for checking satisfiability of circuits, one gets a lower bound for this circuit class. Since then, a number of results of this kind have been proved. For example, Jahanjou et al. (ICALP 2015) and Carmosino et al. (ITCS 2016) proved that if NSETH fails, then E^{NP} has series-parallel circuit size ω(n). One can also derive nonuniform lower bounds from nondeterministic uniform lower bounds. Perhaps the most well-known example is the Karp-Lipton theorem (STOC 1980): if Σ₂ ≠ Π₂, then NP ⊄ P/poly. Some recent examples include the following. Nederlof (STOC 2020) proved a lower bound on the matrix multiplication tensor rank under an assumption that TSP cannot be solved faster than in 2ⁿ time. Belova et al. (SODA 2024) proved that there exists an explicit polynomial family of arithmetic circuit size Ω(n^{δ}), for any δ > 0, assuming that MAX-3-SAT cannot be solved faster than in 2ⁿ nondeterministic time. Williams (FOCS 2024) proved an exponential lower bound for ETHR ∘ ETHR circuits under the Orthogonal Vectors conjecture. Whereas all the lower bounds above are proved under strong assumptions that might eventually be refuted, the revealed connections are of great interest and may still give further insights: one may be able to weaken the used assumptions or to construct generators from other fine-grained reductions. In this paper, we continue developing this line of research and show how uniform nondeterministic lower bounds can be used to construct generators of various types of combinatorial objects that are notoriously hard to analyze: Boolean functions of high circuit size, matrices of high rigidity, and tensors of high rank. Specifically, we prove the following. - If, for some ε and k, k-SAT cannot be solved in input-oblivious co-nondeterministic time O(2^{(1/2+ε)n}), then there exists a monotone Boolean function family in coNP of monotone circuit size 2^{Ω(n / log n)}. Combining this with the result above, we get win-win circuit lower bounds: either E^{NP{}} requires series-parallel circuits of size ω(n) or coNP requires monotone circuits of size 2^{Ω(n / log n)}. - If, for all ε > 0, MAX-3-SAT cannot be solved in co-nondeterministic time O(2^{(1 - ε)n}), then there exist small families of matrices with rigidity exceeding the best known constructions as well as small families of three-dimensional tensors of rank n^{1+Δ}, for some Δ > 0.

Cite as

Nikolai Chukhin, Alexander S. Kulikov, Ivan Mihajlin, and Arina Smirnova. Conditional Complexity Hardness: Monotone Circuit Size, Matrix Rigidity, and Tensor Rank. In 43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 364, pp. 28:1-28:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

Copy BibTex To Clipboard

@InProceedings{chukhin_et_al:LIPIcs.STACS.2026.28,
  author =	{Chukhin, Nikolai and Kulikov, Alexander S. and Mihajlin, Ivan and Smirnova, Arina},
  title =	{{Conditional Complexity Hardness: Monotone Circuit Size, Matrix Rigidity, and Tensor Rank}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{28:1--28:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.28},
  URN =		{urn:nbn:de:0030-drops-255177},
  doi =		{10.4230/LIPIcs.STACS.2026.28},
  annote =	{Keywords: computational complexity, circuit complexity, lower bounds, conditional lower bounds, monotone circuits, matrix rigidity, tensor rank, arithmetic circuits, fine-grained complexity}
}

@InProceedings{chukhin_et_al:LIPIcs.STACS.2026.28,
  author =	{Chukhin, Nikolai and Kulikov, Alexander S. and Mihajlin, Ivan and Smirnova, Arina},
  title =	{{Conditional Complexity Hardness: Monotone Circuit Size, Matrix Rigidity, and Tensor Rank}},
  booktitle =	{43rd International Symposium on Theoretical Aspects of Computer Science (STACS 2026)},
  pages =	{28:1--28:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-412-3},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{364},
  editor =	{Mahajan, Meena and Manea, Florin and McIver, Annabelle and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2026.28},
  URN =		{urn:nbn:de:0030-drops-255177},
  doi =		{10.4230/LIPIcs.STACS.2026.28},
  annote =	{Keywords: computational complexity, circuit complexity, lower bounds, conditional lower bounds, monotone circuits, matrix rigidity, tensor rank, arithmetic circuits, fine-grained complexity}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2026.104

Dimension-Free Correlated Sampling for the Hypersimplex

Authors: Joseph (Seffi) Naor, Nitya Raju, Abhishek Shetty, Aravind Srinivasan, Renata Valieva, and David Wajc

Published in: LIPIcs, Volume 362, 17th Innovations in Theoretical Computer Science Conference (ITCS 2026)

Abstract

Sampling from multiple distributions so as to maximize overlap has been studied by statisticians since the 1950s. Since the 2000s, such correlated sampling from the probability simplex has been a powerful building block in disparate areas of theoretical computer science. We study a generalization of this problem to sampling sets from given vectors in the hypersimplex, i.e., outputting sets of size (at most) k ∈ [n], while maximizing the overlap of the sampled sets. Specifically, the expected difference between two output sets should be at most α times their input vectors' 𝓁₁ distance. A value of α = O(log n) is known to be achievable, due to Chen et al. (ICALP'17). We improve this factor to O(log k), independent of the ambient dimension n. Our algorithm satisfies other desirable properties, including (up to a log^* n factor) input-sparsity sampling time, logarithmic parallel depth and dynamic update time, as well as preservation of submodular objectives. Anticipating broader use of correlated sampling algorithms for the hypersimplex, we present applications of our algorithm to online paging, offline approximation of metric multi-labeling, and swift multi-scenario submodular welfare approximating reallocation.

Cite as

Joseph (Seffi) Naor, Nitya Raju, Abhishek Shetty, Aravind Srinivasan, Renata Valieva, and David Wajc. Dimension-Free Correlated Sampling for the Hypersimplex. In 17th Innovations in Theoretical Computer Science Conference (ITCS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 362, pp. 104:1-104:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

Copy BibTex To Clipboard

@InProceedings{naor_et_al:LIPIcs.ITCS.2026.104,
  author =	{Naor, Joseph (Seffi) and Raju, Nitya and Shetty, Abhishek and Srinivasan, Aravind and Valieva, Renata and Wajc, David},
  title =	{{Dimension-Free Correlated Sampling for the Hypersimplex}},
  booktitle =	{17th Innovations in Theoretical Computer Science Conference (ITCS 2026)},
  pages =	{104:1--104:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-410-9},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{362},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2026.104},
  URN =		{urn:nbn:de:0030-drops-253918},
  doi =		{10.4230/LIPIcs.ITCS.2026.104},
  annote =	{Keywords: Correlated Rounding, Dependent Rounding}
}

Document

Survey

DOI: 10.4230/TGDK.3.2.1

Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2

Abstract

In recent years, knowledge graphs have gained interest and witnessed widespread applications in various domains, such as information retrieval, question-answering, recommendation systems, amongst others. Large-scale knowledge graphs to this end have demonstrated their utility in effectively representing structured knowledge. To further facilitate the application of machine learning techniques, knowledge graph embedding models have been developed. Such models can transform entities and relationships within knowledge graphs into vectors. However, these embedding models often face challenges related to noise, missing information, distribution shift, adversarial attacks, etc. This can lead to sub-optimal embeddings and incorrect inferences, thereby negatively impacting downstream applications. While the existing literature has focused so far on adversarial attacks on KGE models, the challenges related to the other critical aspects remain unexplored. In this paper, we, first of all, give a unified definition of resilience, encompassing several factors such as generalisation, in-distribution generalization, distribution adaption, and robustness. After formalizing these concepts for machine learning in general, we define them in the context of knowledge graphs. To find the gap in the existing works on resilience in the context of knowledge graphs, we perform a systematic survey, taking into account all these aspects mentioned previously. Our survey results show that most of the existing works focus on a specific aspect of resilience, namely robustness. After categorizing such works based on their respective aspects of resilience, we discuss the challenges and future research directions.

Cite as

Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo. Resilience in Knowledge Graph Embeddings. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 1:1-1:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{sharma_et_al:TGDK.3.2.1,
  author =	{Sharma, Arnab and Kouagou, N'Dah Jean and Ngomo, Axel-Cyrille Ngonga},
  title =	{{Resilience in Knowledge Graph Embeddings}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:38},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.1},
  URN =		{urn:nbn:de:0030-drops-248117},
  doi =		{10.4230/TGDK.3.2.1},
  annote =	{Keywords: Knowledge graphs, Resilience, Robustness}
}

Document

Research

DOI: 10.4230/TGDK.3.2.3

GraphRAG on Technical Documents - Impact of Knowledge Graph Schema

Authors: Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2

Abstract

Retrieval Augmented Generation (RAG) is seeing rapid adoption in industry to enable employees to query information captured in proprietary data for their organisation. In this work, we test the impact of domain-relevant knowledge graph schemas on the results of Microsoft’s GraphRAG pipeline. Our approach aims to address the poor quality of GraphRAG responses on technical reports rich in domain-specific terms. The use case involves technical reports about geology, chemistry and mineral processing published by the Minerals Research Institute of Western Australia (MRIWA). Four schemas are considered: a simple five-class minerals domain expert-developed schema, an expanded minerals domain schema, the Microsoft GraphRAG auto-generated schema, and a schema-less GraphRAG. These are compared to a conventional baseline RAG. Performance is evaluated using a scoring approach that accounts for the mix of correct, incorrect, additional, and missing content in RAG responses. The results show that the simple five-class minerals domain schema extracts approximately 10% more entities from the MRIWA reports than the other schema options. Additionally, both the five-class and the expanded eight-class minerals domain schemas produce the most factually correct answers and the fewest hallucinations. We attribute this to the minerals-specific schemas extracting more relevant, domain-specific information during the Indexing stage. As a result, the Query stage’s context window includes more high-value content. This contributes to the observed improvement in answer quality compared to the other pipelines. In contrast, pipelines with fewer domain-related entities in the KG retrieve less valuable information, leaving more room for irrelevant content in the context window. Baseline RAG responses were typically shorter, less complete, and contained more hallucinations compared to our GraphRAG pipelines. We provide a complete set of resources at https://github.com/nlp-tlp/GraphRAG-on-Minerals-Domain/tree/main. These resources include links to the MRIWA reports, a set of questions (from simple to challenging) along with domain-expert curated answers, schemas, and evaluations of the pipelines.

Cite as

Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke. GraphRAG on Technical Documents - Impact of Knowledge Graph Schema. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 3:1-3:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{scaffidi_et_al:TGDK.3.2.3,
  author =	{Scaffidi, Henri and Hodkiewicz, Melinda and Woods, Caitlin and Roocke, Nicole},
  title =	{{GraphRAG on Technical Documents - Impact of Knowledge Graph Schema}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:24},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.3},
  URN =		{urn:nbn:de:0030-drops-248131},
  doi =		{10.4230/TGDK.3.2.3},
  annote =	{Keywords: RAG, minerals, local search, global search, entity extraction, competency questions}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.95

The Planted Orthogonal Vectors Problem

Authors: David Kühnemann, Adam Polak, and Alon Rosen

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

In the k-Orthogonal Vectors (k-OV) problem we are given k sets, each containing n binary vectors of dimension d = n^o(1), and our goal is to pick one vector from each set so that at each coordinate at least one vector has a zero. It is a central problem in fine-grained complexity, conjectured to require n^{k-o(1)} time in the worst case. We propose a way to plant a solution among vectors with i.i.d. p-biased entries, for appropriately chosen p, so that the planted solution is the unique one. Our conjecture is that the resulting k-OV instances still require time n^{k-o(1)} to solve, on average. Our planted distribution has the property that any subset of strictly less than k vectors has the same marginal distribution as in the model distribution, consisting of i.i.d. p-biased random vectors. We use this property to give average-case search-to-decision reductions for k-OV.

Cite as

David Kühnemann, Adam Polak, and Alon Rosen. The Planted Orthogonal Vectors Problem. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 95:1-95:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{kuhnemann_et_al:LIPIcs.ESA.2025.95,
  author =	{K\"{u}hnemann, David and Polak, Adam and Rosen, Alon},
  title =	{{The Planted Orthogonal Vectors Problem}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{95:1--95:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.95},
  URN =		{urn:nbn:de:0030-drops-245640},
  doi =		{10.4230/LIPIcs.ESA.2025.95},
  annote =	{Keywords: Average-case complexity, fine-grained complexity, orthogonal vectors}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.111

Hardness of Median and Center in the Ulam Metric

Authors: Nick Fischer, Elazar Goldenberg, Mursalin Habib, and Karthik C. S.

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

The classical rank aggregation problem seeks to combine a set X of n permutations into a single representative "consensus" permutation. In this paper, we investigate two fundamental rank aggregation tasks under the well-studied Ulam metric: computing a median permutation (which minimizes the sum of Ulam distances to X) and computing a center permutation (which minimizes the maximum Ulam distance to X) in two settings. - Continuous Setting: In the continuous setting, the median/center is allowed to be any permutation. It is known that computing a center in the Ulam metric is NP-hard and we add to this by showing that computing a median is NP-hard as well via a simple reduction from the Max-Cut problem. While this result may not be unexpected, it had remained elusive until now and confirms a speculation by Chakraborty, Das, and Krauthgamer [SODA '21]. - Discrete Setting: In the discrete setting, the median/center must be a permutation from the input set. We fully resolve the fine-grained complexity of the discrete median and discrete center problems under the Ulam metric, proving that the naive Õ(n² L)-time algorithm (where L is the length of the permutation) is conditionally optimal. This resolves an open problem raised by Abboud, Bateni, Cohen-Addad, Karthik C. S., and Seddighin [APPROX '23]. Our reductions are inspired by the known fine-grained lower bounds for similarity measures, but we face and overcome several new highly technical challenges.

Cite as

Nick Fischer, Elazar Goldenberg, Mursalin Habib, and Karthik C. S.. Hardness of Median and Center in the Ulam Metric. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 111:1-111:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{fischer_et_al:LIPIcs.ESA.2025.111,
  author =	{Fischer, Nick and Goldenberg, Elazar and Habib, Mursalin and Karthik C. S.},
  title =	{{Hardness of Median and Center in the Ulam Metric}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{111:1--111:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.111},
  URN =		{urn:nbn:de:0030-drops-245809},
  doi =		{10.4230/LIPIcs.ESA.2025.111},
  annote =	{Keywords: Ulam distance, median, center, rank aggregation, fine-grained complexity}
}

Document

DOI: 10.4230/LIPIcs.GIScience.2025.18

U-Prithvi: Integrating a Foundation Model and U-Net for Enhanced Flood Inundation Mapping

Authors: Vit Kostejn, Yamil Essus, Jenna Abrahamson, and Ranga Raju Vatsavai

Published in: LIPIcs, Volume 346, 13th International Conference on Geographic Information Science (GIScience 2025)

Abstract

In recent years, large pre-trained models, commonly referred to as foundation models, have become increasingly popular for various tasks leveraging transfer learning. This trend has expanded to remote sensing, where transformer-based foundation models such as Prithvi, msGFM, and SatSwinMAE have been utilized for a range of applications. While these transformer-based models, particularly the Prithvi model, exhibit strong generalization capabilities, they have limitations on capturing fine-grained details compared to convolutional neural network architectures like U-Net in segmentation tasks. In this paper, we propose a novel architecture, U-Prithvi, which combines the strengths of the Prithvi transformer with those of U-Net. We introduce a RandomHalfMaskLayer to ensure balanced learning from both models during training. Our approach is evaluated on the Sen1Floods11 dataset for flood inundation mapping, and experimental results demonstrate better performance of U-Prithvi over both individual models, achieving improved performance on out-of-sample data. While this principle is illustrated using the Prithvi model, it is easily adaptable to other foundation models.

Cite as

Vit Kostejn, Yamil Essus, Jenna Abrahamson, and Ranga Raju Vatsavai. U-Prithvi: Integrating a Foundation Model and U-Net for Enhanced Flood Inundation Mapping. In 13th International Conference on Geographic Information Science (GIScience 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 346, pp. 18:1-18:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{kostejn_et_al:LIPIcs.GIScience.2025.18,
  author =	{Kostejn, Vit and Essus, Yamil and Abrahamson, Jenna and Vatsavai, Ranga Raju},
  title =	{{U-Prithvi: Integrating a Foundation Model and U-Net for Enhanced Flood Inundation Mapping}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{18:1--18:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.18},
  URN =		{urn:nbn:de:0030-drops-238479},
  doi =		{10.4230/LIPIcs.GIScience.2025.18},
  annote =	{Keywords: GeoAI, flood mapping, foundation model, U-Net, Prithvi}
}

@InProceedings{kostejn_et_al:LIPIcs.GIScience.2025.18,
  author =	{Kostejn, Vit and Essus, Yamil and Abrahamson, Jenna and Vatsavai, Ranga Raju},
  title =	{{U-Prithvi: Integrating a Foundation Model and U-Net for Enhanced Flood Inundation Mapping}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{18:1--18:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.18},
  URN =		{urn:nbn:de:0030-drops-238479},
  doi =		{10.4230/LIPIcs.GIScience.2025.18},
  annote =	{Keywords: GeoAI, flood mapping, foundation model, U-Net, Prithvi}
}

Document

DOI: 10.4230/LIPIcs.GIScience.2025.9

Identifying Resilient Communities in Road Networks: A Path-Based Embedding Approach

Authors: Christopher Wagner, Somayeh Dodge, and Danial Alizadeh

Published in: LIPIcs, Volume 346, 13th International Conference on Geographic Information Science (GIScience 2025)

Abstract

Effective resilience analysis of road networks is fundamental to building sustainable and disaster prepared cities. Identifying which road segments share similar vulnerabilities is important for pinpointing high-risk areas within the network and implementing measures to safeguard them against future disruptions. Graph-based community detection can be applied to group together areas of the network sharing similar structural vulnerabilities. However, current graph-based community detection methods either struggle with integrating node features during partitioning or do not account for the path-based dependencies in road networks. This paper introduces the Path-based Community Embedding (PCE) model, an approach that leverages path-based embeddings to overcome these limitations. PCE combines the strengths of graph attention networks and Long Short-Term Memory models (LSTMs) to learn representations that incorporate both local neighborhood information and long-range path dependencies. Our results on the Santa Barbara road network show that PCE improves community detection performance for resilience analysis, thus offering a powerful tool for urban planners and transportation engineers to preemptively identify vulnerabilities in road networks.

Cite as

Christopher Wagner, Somayeh Dodge, and Danial Alizadeh. Identifying Resilient Communities in Road Networks: A Path-Based Embedding Approach. In 13th International Conference on Geographic Information Science (GIScience 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 346, pp. 9:1-9:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{wagner_et_al:LIPIcs.GIScience.2025.9,
  author =	{Wagner, Christopher and Dodge, Somayeh and Alizadeh, Danial},
  title =	{{Identifying Resilient Communities in Road Networks: A Path-Based Embedding Approach}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{9:1--9:10},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.9},
  URN =		{urn:nbn:de:0030-drops-238380},
  doi =		{10.4230/LIPIcs.GIScience.2025.9},
  annote =	{Keywords: road networks, resilience analysis, machine learning, graph neural networks}
}

Document

DOI: 10.4230/LIPIcs.GIScience.2025.8

BERT4Traj: Transformer-Based Trajectory Reconstruction for Sparse Mobility Data

Authors: Hao Yang, Angela Yao, Christopher C. Whalen, and Gengchen Mai

Published in: LIPIcs, Volume 346, 13th International Conference on Geographic Information Science (GIScience 2025)

Abstract

Understanding human mobility is essential for applications in public health, transportation, and urban planning. However, mobility data often suffers from sparsity due to limitations in data collection methods, such as infrequent GPS sampling or call detail record (CDR) data that only capture locations during communication events. To address this challenge, we propose BERT4Traj, a transformer-based model that reconstructs complete mobility trajectories by predicting hidden visits in sparse movement sequences. Inspired by BERT’s masked language modeling objective and self-attention mechanisms, BERT4Traj leverages spatial embeddings, temporal embeddings, and contextual background features such as demographics and anchor points. We evaluate BERT4Traj on real-world CDR and GPS datasets collected in Kampala, Uganda, demonstrating that our approach significantly outperforms traditional models such as Markov Chains, KNN, RNNs, and LSTMs. Our results show that BERT4Traj effectively reconstructs detailed and continuous mobility trajectories, enhancing insights into human movement patterns.

Cite as

Hao Yang, Angela Yao, Christopher C. Whalen, and Gengchen Mai. BERT4Traj: Transformer-Based Trajectory Reconstruction for Sparse Mobility Data. In 13th International Conference on Geographic Information Science (GIScience 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 346, pp. 8:1-8:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{yang_et_al:LIPIcs.GIScience.2025.8,
  author =	{Yang, Hao and Yao, Angela and Whalen, Christopher C. and Mai, Gengchen},
  title =	{{BERT4Traj: Transformer-Based Trajectory Reconstruction for Sparse Mobility Data}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{8:1--8:9},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.8},
  URN =		{urn:nbn:de:0030-drops-238373},
  doi =		{10.4230/LIPIcs.GIScience.2025.8},
  annote =	{Keywords: Human Mobility, Trajectory Reconstruction, Deep Learning, CDR, GPS}
}

Document

DOI: 10.4230/LIPIcs.SAT.2025.30

Bridging Language Models and Symbolic Solvers via the Model Context Protocol

Authors: Stefan Szeider

Published in: LIPIcs, Volume 341, 28th International Conference on Theory and Applications of Satisfiability Testing (SAT 2025)

Abstract

This paper presents the MCP Solver, a system that bridges large language models with symbolic solvers through the Model Context Protocol (MCP). The system includes a server and a client component. The server provides an interface to constraint programming (via MiniZinc Python), propositional satisfiability and maximum satisfiability (both via PySAT), and SAT modulo Theories (via Python Z3). The client contains an agent that connects to the server via MCP and uses a language model to autonomously translate problem statements (given in English) into encodings through an incremental editing process and runs the solver. Our experiments demonstrate that this neurosymbolic integration effectively combines the natural language understanding of language models with robust solving capabilities across multiple solving paradigms.

Cite as

Stefan Szeider. Bridging Language Models and Symbolic Solvers via the Model Context Protocol. In 28th International Conference on Theory and Applications of Satisfiability Testing (SAT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 341, pp. 30:1-30:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{szeider:LIPIcs.SAT.2025.30,
  author =	{Szeider, Stefan},
  title =	{{Bridging Language Models and Symbolic Solvers via the Model Context Protocol}},
  booktitle =	{28th International Conference on Theory and Applications of Satisfiability Testing (SAT 2025)},
  pages =	{30:1--30:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-381-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{341},
  editor =	{Berg, Jeremias and Nordstr\"{o}m, Jakob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SAT.2025.30},
  URN =		{urn:nbn:de:0030-drops-237649},
  doi =		{10.4230/LIPIcs.SAT.2025.30},
  annote =	{Keywords: Large Language Models, Agents, Constraint Programming, Satisfiability Solvers, Maximum Satisfiability, SAT Modulo Theories, Model Context Protocol}
}

Document

DOI: 10.4230/LIPIcs.CCC.2025.31

Towards Free Lunch Derandomization from Necessary Assumptions (And OWFs)

Authors: Marshall Ball, Lijie Chen, and Roei Tell

Published in: LIPIcs, Volume 339, 40th Computational Complexity Conference (CCC 2025)

Abstract

The question of optimal derandomization, introduced by Doron et. al (JACM 2022), garnered significant recent attention. Works in recent years showed conditional superfast derandomization algorithms, as well as conditional impossibility results, and barriers for obtaining superfast derandomization using certain black-box techniques. Of particular interest is the extreme high-end, which focuses on "free lunch" derandomization, as suggested by Chen and Tell (FOCS 2021). This is derandomization that incurs essentially no time overhead, and errs only on inputs that are infeasible to find. Constructing such algorithms is challenging, and so far there have not been any results following the one in their initial work. In their result, their algorithm is essentially the classical Nisan-Wigderson generator, and they relied on an ad-hoc assumption asserting the existence of a function that is non-batch-computable over all polynomial-time samplable distributions. In this work we deduce free lunch derandomization from a variety of natural hardness assumptions. In particular, we do not resort to non-batch-computability, and the common denominator for all of our assumptions is hardness over all polynomial-time samplable distributions, which is necessary for the conclusion. The main technical components in our proofs are constructions of new and superfast targeted generators, which completely eliminate the time overheads that are inherent to all previously known constructions. In particular, we present an alternative construction for the targeted generator by Chen and Tell (FOCS 2021), which is faster than the original construction, and also more natural and technically intuitive. These contributions significantly strengthen the evidence for the possibility of free lunch derandomization, distill the required assumptions for such a result, and provide the first set of dedicated technical tools that are useful for studying the question.

Cite as

Marshall Ball, Lijie Chen, and Roei Tell. Towards Free Lunch Derandomization from Necessary Assumptions (And OWFs). In 40th Computational Complexity Conference (CCC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 339, pp. 31:1-31:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{ball_et_al:LIPIcs.CCC.2025.31,
  author =	{Ball, Marshall and Chen, Lijie and Tell, Roei},
  title =	{{Towards Free Lunch Derandomization from Necessary Assumptions (And OWFs)}},
  booktitle =	{40th Computational Complexity Conference (CCC 2025)},
  pages =	{31:1--31:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-379-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{339},
  editor =	{Srinivasan, Srikanth},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2025.31},
  URN =		{urn:nbn:de:0030-drops-237259},
  doi =		{10.4230/LIPIcs.CCC.2025.31},
  annote =	{Keywords: Pseudorandomness, Derandomization}
}

Document

DOI: 10.4230/LIPIcs.ECOOP.2025.35

Chain of Grounded Objectives: Concise Goal-Oriented Prompting for Code Generation

Authors: Sangyeop Yeo, Seung-Won Hwang, and Yu-Seung Ma

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)

Abstract

The use of Large Language Models (LLMs) for code generation has gained significant attention in recent years. Existing methods often aim to improve the quality of generated code by incorporating additional contextual information or guidance into input prompts. Many of these approaches adopt process-oriented reasoning strategies, mimicking human-like step-by-step thinking; however, they may not always align with the structured nature of programming languages. This paper introduces Chain of Grounded Objectives (CGO), a concise goal-oriented prompting approach that embeds functional objectives into prompts to enhance code generation. By focusing on precisely defined objectives rather than explicit procedural steps, CGO aligns more naturally with programming tasks while retaining flexibility. Empirical evaluations on HumanEval, MBPP, their extended versions, and LiveCodeBench show that CGO achieves accuracy comparable to or better than existing methods while using fewer tokens, making it a more efficient approach to LLM-based code generation.

Cite as

Sangyeop Yeo, Seung-Won Hwang, and Yu-Seung Ma. Chain of Grounded Objectives: Concise Goal-Oriented Prompting for Code Generation. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 35:1-35:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{yeo_et_al:LIPIcs.ECOOP.2025.35,
  author =	{Yeo, Sangyeop and Hwang, Seung-Won and Ma, Yu-Seung},
  title =	{{Chain of Grounded Objectives: Concise Goal-Oriented Prompting for Code Generation}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{35:1--35:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.35},
  URN =		{urn:nbn:de:0030-drops-233271},
  doi =		{10.4230/LIPIcs.ECOOP.2025.35},
  annote =	{Keywords: Artificial Intelligence, Natural Language Processing, Prompt Design, Large Language Models, Code Generation}
}

Document

DOI: 10.4230/LIPIcs.ECOOP.2025.13

FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation

Authors: Amber Gorzynski and Alastair F. Donaldson

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)

Abstract

Decompilation is the process of translating compiled code into high-level code. Control flow recovery is a challenging part of the process. "Misdecompilations" can occur, whereby the decompiled code does not accurately represent the semantics of the compiled code, despite it being syntactically valid. This is problematic because it can mislead users who are trying to reason about the program. We present CFG-based program generation: a novel approach to randomised testing that aims to improve the control flow recovery of decompilers. CFG-based program generation involves randomly generating control flow graphs (CFGs) and paths through each graph. Inspired by prior work in the domain of GPU computing, (CFG, path) pairs are "fleshed" into test programs. Each program is decompiled and recompiled. The test oracle verifies whether the actual runtime path through the graph matches the expected path. Any difference in the execution paths after recompilation indicates a possible misdecompilation. A key benefit of this approach is that it is largely independent of the source and target languages in question because it is focused on control flow. The approach is therefore applicable to numerous decompilation settings. The trade-off resulting from the focus on control flow is that misdecompilation bugs that do not relate to control flow (e.g. bugs that involve specific arithmetic operations) are out of scope. We have implemented this approach in FuzzFlesh, an open-source randomised testing tool. FuzzFlesh can be easily configured to target a variety of low-level languages and decompiler toolchains because most of the CFG and path generation process is language-independent. At present, FuzzFlesh supports testing decompilation of Java bytecode, .NET assembly and x86 machine code. In addition to program generation, FuzzFlesh also includes an automated test-case reducer that operates on the CFG rather than the low-level program, which means that it can be applied to any of the target languages. We present a large experimental campaign applying FuzzFlesh to a variety of decompilers, leading to the discovery of 12 previously-unknown bugs across two language formats, six of which have been fixed. We present experiments comparing our generic FuzzFlesh tool to two state-of-the-art decompiler testing tools targeted at specific languages. As expected, the coverage our generic FuzzFlesh tool achieves on a given decompiler is lower than the coverage achieved by a tool specifically designed for the input format of that decompiler. However, due to its focus on control flow, FuzzFlesh is able to cover sections of control flow recovery code that the targeted tools cannot reach, and identify control flow related bugs that the targeted tools miss.

Cite as

Amber Gorzynski and Alastair F. Donaldson. FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 13:1-13:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{gorzynski_et_al:LIPIcs.ECOOP.2025.13,
  author =	{Gorzynski, Amber and Donaldson, Alastair F.},
  title =	{{FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{13:1--13:26},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.13},
  URN =		{urn:nbn:de:0030-drops-233062},
  doi =		{10.4230/LIPIcs.ECOOP.2025.13},
  annote =	{Keywords: Decompiler, Reverse Engineering, Control Flow, Software Testing, Fuzzing}
}

Document

DOI: 10.4230/LIPIcs.ECOOP.2025.6

Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories

Authors: Tianyu Chen, Zeyu Wang, Lin Li, Ding Li, Zongyang Li, Xiaoning Chang, Pan Bian, Guangtai Liang, Qianxiang Wang, and Tao Xie

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)

Abstract

Functionality-specific vulnerabilities, which mainly occur in Application Programming Interfaces (APIs) with specific functionalities, are crucial for software developers to detect and avoid. When detecting individual functionality-specific vulnerabilities, the existing two categories of approaches are ineffective because they consider only the API bodies and are unable to handle diverse implementations of functionality-equivalent APIs. To effectively detect functionality-specific vulnerabilities, we propose APISS, the first approach to utilize API doc strings and signatures instead of API bodies. APISS first retrieves functionality-equivalent APIs for APIs with existing vulnerabilities and then migrates Proof-of-Concepts (PoCs) of the existing vulnerabilities for newly detected vulnerable APIs. To retrieve functionality-equivalent APIs, we leverage a Large Language Model for API embedding to improve the accuracy and address the effectiveness and scalability issues suffered by the existing approaches. To migrate PoCs of the existing vulnerabilities for newly detected vulnerable APIs, we design a semi-automatic schema to substantially reduce manual costs. We conduct a comprehensive evaluation to empirically compare APISS with four state-of-the-art approaches of detecting vulnerabilities and two state-of-the-art approaches of retrieving functionality-equivalent APIs. The evaluation subjects include 180 widely used Java repositories using 10 existing vulnerabilities, along with their PoCs. The results show that APISS effectively retrieves functionality-equivalent APIs, achieving a Top-1 Accuracy of 0.81 while the best of the baselines under comparison achieves only 0.55. APISS is highly efficient: the manual costs are within 10 minutes per vulnerability and the end-to-end runtime overhead of testing one candidate API is less than 2 hours. APISS detects 179 new vulnerabilities and receives 60 new CVE IDs, bringing high value to security practice.

Cite as

Tianyu Chen, Zeyu Wang, Lin Li, Ding Li, Zongyang Li, Xiaoning Chang, Pan Bian, Guangtai Liang, Qianxiang Wang, and Tao Xie. Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 6:1-6:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.ECOOP.2025.6,
  author =	{Chen, Tianyu and Wang, Zeyu and Li, Lin and Li, Ding and Li, Zongyang and Chang, Xiaoning and Bian, Pan and Liang, Guangtai and Wang, Qianxiang and Xie, Tao},
  title =	{{Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{6:1--6:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.6},
  URN =		{urn:nbn:de:0030-drops-232999},
  doi =		{10.4230/LIPIcs.ECOOP.2025.6},
  annote =	{Keywords: Application Security, Vulnerability Detection, Large Language Model}
}

@InProceedings{chen_et_al:LIPIcs.ECOOP.2025.6,
  author =	{Chen, Tianyu and Wang, Zeyu and Li, Lin and Li, Ding and Li, Zongyang and Chang, Xiaoning and Bian, Pan and Liang, Guangtai and Wang, Qianxiang and Xie, Tao},
  title =	{{Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{6:1--6:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.6},
  URN =		{urn:nbn:de:0030-drops-232999},
  doi =		{10.4230/LIPIcs.ECOOP.2025.6},
  annote =	{Keywords: Application Security, Vulnerability Detection, Large Language Model}
}

Document

Survey

DOI: 10.4230/TGDK.3.1.3

Uncertainty Management in the Construction of Knowledge Graphs: A Survey

Authors: Lucas Jarnac, Yoan Chabot, and Miguel Couceiro

Published in: TGDK, Volume 3, Issue 1 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 1

Abstract

Knowledge Graphs (KGs) are a major asset for companies thanks to their great flexibility in data representation and their numerous applications, e.g., vocabulary sharing, Q&A or recommendation systems. To build a KG, it is a common practice to rely on automatic methods for extracting knowledge from various heterogeneous sources. However, in a noisy and uncertain world, knowledge may not be reliable and conflicts between data sources may occur. Integrating unreliable data would directly impact the use of the KG, therefore such conflicts must be resolved. This could be done manually by selecting the best data to integrate. This first approach is highly accurate, but costly and time-consuming. That is why recent efforts focus on automatic approaches, which represent a challenging task since it requires handling the uncertainty of extracted knowledge throughout its integration into the KG. We survey state-of-the-art approaches in this direction and present constructions of both open and enterprise KGs. We then describe different knowledge extraction methods and discuss downstream tasks after knowledge acquisition, including KG completion using embedding models, knowledge alignment, and knowledge fusion in order to address the problem of knowledge uncertainty in KG construction. We conclude with a discussion on the remaining challenges and perspectives when constructing a KG taking into account uncertainty.

Cite as

Lucas Jarnac, Yoan Chabot, and Miguel Couceiro. Uncertainty Management in the Construction of Knowledge Graphs: A Survey. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 3:1-3:48, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{jarnac_et_al:TGDK.3.1.3,
  author =	{Jarnac, Lucas and Chabot, Yoan and Couceiro, Miguel},
  title =	{{Uncertainty Management in the Construction of Knowledge Graphs: A Survey}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:48},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.3},
  URN =		{urn:nbn:de:0030-drops-233733},
  doi =		{10.4230/TGDK.3.1.3},
  annote =	{Keywords: Knowledge reconciliation, Uncertainty, Heterogeneous sources, Knowledge graph construction}
}

25 Search Results for "Gao, Jiawei"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message