DROPS

Document

Research

Native Provenance Computation for Federated and Non-Federated SPARQL Queries

Authors: Zubaria Asma, Daniel Hernández, Luis Galárraga, Giorgos Flouris, Irini Fundulaki, and Katja Hose

Published in: TGDK, Volume 4, Issue 1 (2026). Transactions on Graph Data and Knowledge, Volume 4, Issue 1

Abstract

The popularity of knowledge graphs (KGs) owes credit to their flexible data model, which is suitable for data integration from multiple sources. Several KG-based applications, such as trust assessment, view maintenance, or data valuation on dynamic data, rely on the ability to compute provenance explanations for query results. This need becomes more urgent in federated query processing systems, which allow the online consumption of heterogeneous and decentralized Web data. However, the problem of computing and interacting with provenance has received little attention, especially in the federated setting. On those grounds, this paper introduces the NPCS (Native Provenance Computation for SPARQL) approach, and its federated variant Fed-NPCS, that compute provenance for SPARQL query results. Both approaches build upon spm-semirings to annotate the results of monotonic and non-monotonic SPARQL queries with their provenance. Due to their reliance on query rewriting techniques, the approaches are directly applicable to already deployed SPARQL engines and federations using different reification schemes, including RDF-star. Our experimental evaluation shows that our novel query rewriting approach brings significant run-time improvements w.r.t. the state-of-the-art across both centralized and federated settings. In centralized settings, our tests on two popular SPARQL engines (GraphDB and Stardog) reveal substantial runtime gains over existing query rewriting solutions, enabling scalability to RDF graphs with billions of triples. In federated settings, our experiments on the FedShop benchmark with GraphDB show the viability of Fed-NPCS for federations with up to 200 sources.

Cite as

Zubaria Asma, Daniel Hernández, Luis Galárraga, Giorgos Flouris, Irini Fundulaki, and Katja Hose. Native Provenance Computation for Federated and Non-Federated SPARQL Queries. In Transactions on Graph Data and Knowledge (TGDK), Volume 4, Issue 1, pp. 4:1-4:43, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

Copy BibTex To Clipboard

@Article{asma_et_al:TGDK.4.1.4,
  author =	{Asma, Zubaria and Hern\'{a}ndez, Daniel and Gal\'{a}rraga, Luis and Flouris, Giorgos and Fundulaki, Irini and Hose, Katja},
  title =	{{Native Provenance Computation for Federated and Non-Federated SPARQL Queries}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:43},
  ISSN =	{2942-7517},
  year =	{2026},
  volume =	{4},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.4.1.4},
  URN =		{urn:nbn:de:0030-drops-259642},
  doi =		{10.4230/TGDK.4.1.4},
  annote =	{Keywords: native provenance computation, federated SPARQL queries, data provenance, NPCS, Fed-NPCS}
}

Document

Research

DOI: 10.4230/TGDK.4.1.1

On the Computational Cost of Knowledge Graph Embeddings

Authors: Victor Charpenay, Mansour Zoubeirou A Mayaki, and Antoine Zimmermann

Published in: TGDK, Volume 4, Issue 1 (2026). Transactions on Graph Data and Knowledge, Volume 4, Issue 1

Abstract

Over a decade, numerous Knowledge Graph Embedding (KGE) models have been designed and evaluated on reference datasets, always with increasing performance. In this paper, we re-evaluate these models with respect to their computational efficiency during training, by estimating the computational cost of the procedure expressed in floating-point operations. We design a cost model based on analytical expressions and apply it on a collection of 20 KGE models, representative of the state-of-the-art. We show that dimensionality or parameter efficiency, used in the literature to compare models with each other, are not suitable to evaluate the true cost of models. Through fixed-budget experiments, a novel approach to evaluate KGE models based on cost estimates, we re-assess the relative performance of model families compared to the state-of-the-art. Bilinear models such as ComplEx underperform with a low computational budget while hyperbolic linear models appear to offer no particular benefit compared to simpler Euclidian models, especially the MuRE model. Neural models, such as ConvE or CompGCN, achieve reasonable performance in the literature but their high computational cost appears unnecessary when compared with other models. The trade-off between efficiency and expressivity of both linear and neural models is to be further explored.

Cite as

Victor Charpenay, Mansour Zoubeirou A Mayaki, and Antoine Zimmermann. On the Computational Cost of Knowledge Graph Embeddings. In Transactions on Graph Data and Knowledge (TGDK), Volume 4, Issue 1, pp. 1:1-1:30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

Copy BibTex To Clipboard

@Article{charpenay_et_al:TGDK.4.1.1,
  author =	{Charpenay, Victor and Zoubeirou A Mayaki, Mansour and Zimmermann, Antoine},
  title =	{{On the Computational Cost of Knowledge Graph Embeddings}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:30},
  ISSN =	{2942-7517},
  year =	{2026},
  volume =	{4},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.4.1.1},
  URN =		{urn:nbn:de:0030-drops-256863},
  doi =		{10.4230/TGDK.4.1.1},
  annote =	{Keywords: Knowledge Graph Embedding, Parameter Efficiency, Computational Budget, Green AI}
}

Document

Survey

DOI: 10.4230/TGDK.4.1.2

Temporal Modelling in Cultural Heritage Knowledge Graphs: Use Cases, Requirements, Evaluation, and Decision Support

Authors: Oleksandra Bruns, Jörg Waitelonis, Jeff Z. Pan, and Harald Sack

Published in: TGDK, Volume 4, Issue 1 (2026). Transactions on Graph Data and Knowledge, Volume 4, Issue 1

Abstract

Our culture, history and world are in constant motion, continuously shaped by the flow of time, evolving narratives, and shifting relationships. Capturing this temporal complexity within cultural heritage (CH) knowledge graphs is essential for preserving the dynamic nature of human heritage. However, standard RDF predicates fail to effectively model the temporal aspects of cultural data, such as changing facts, evolving relationships, and temporal concepts. Over the past two decades, a variety of RDF-based approaches have been proposed to address this limitation, yet guidance is missing on which method best suits specific CH contexts. This paper presents a systematic evaluation of temporal RDF modelling approaches from a CH perspective. Based on an analysis of real-world CH use cases, core temporal requirements are identified that reflect both modelling expressivity and practical concerns. Six prominent approaches - RDF*, tRDF, Named Graphs, Singleton Property, N-ary Relations, and 4D Fluents - are assessed across these requirements. Our findings reveal that no single solution fits all scenarios, but suitable approaches can be selected based on project-specific priorities. To support practitioners, a decision-support tool is introduced to guide them in selecting the most suitable extension for their specific needs. This work provides practical guidance for CH modelling and contributes to the broader development of temporally aware Linked Data.

Cite as

Oleksandra Bruns, Jörg Waitelonis, Jeff Z. Pan, and Harald Sack. Temporal Modelling in Cultural Heritage Knowledge Graphs: Use Cases, Requirements, Evaluation, and Decision Support. In Transactions on Graph Data and Knowledge (TGDK), Volume 4, Issue 1, pp. 2:1-2:46, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)

Copy BibTex To Clipboard

@Article{bruns_et_al:TGDK.4.1.2,
  author =	{Bruns, Oleksandra and Waitelonis, J\"{o}rg and Pan, Jeff Z. and Sack, Harald},
  title =	{{Temporal Modelling in Cultural Heritage Knowledge Graphs: Use Cases, Requirements, Evaluation, and Decision Support}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{2:1--2:46},
  ISSN =	{2942-7517},
  year =	{2026},
  volume =	{4},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.4.1.2},
  URN =		{urn:nbn:de:0030-drops-256871},
  doi =		{10.4230/TGDK.4.1.2},
  annote =	{Keywords: Temporal Data Representation, RDF Extensions, Cultural Heritage, Knowledge Graphs}
}

Document

Research

DOI: 10.4230/TGDK.3.3.1

A Logic Programming Approach to Repairing SHACL Constraint Violations

Authors: Shqiponja Ahmetaj, Robert David, Axel Polleres, and Mantas Šimkus

Published in: TGDK, Volume 3, Issue 3 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 3

Abstract

The Shapes Constraint Language (SHACL) is a recent standard, a W3C recommendation, for validating RDF graphs against shape constraints to be checked on target nodes of a data graph. The standard also describes the notion of validation reports, which detail the results of the validation process. In case of violation of constraints, the validation report should explain the reasons for non-validation, offering guidance on how to identify or fix violations in the data graph. Since the specification left it open to SHACL processors to define such explanations, a recent work proposed the use of explanations in the style of database repairs, where a repair is a set of additions to or deletions from the data graph so that the resulting graph validates against the constraints. In this paper, we study such repairs for non-recursive SHACL, the largest fragment of SHACL that is fully defined in the specification. We propose an algorithm to compute repairs by encoding the explanation problem - using Answer Set Programming (ASP) - into a logic program, where the answer sets contain (minimal) repairs. We then study a scenario where it is not possible to simultaneously repair all the targets, which may be the case due to overall unsatisfiability or conflicting constraints. We introduce a relaxed notion of validation, which allows to validate a (maximal) subset of the targets and adapt the ASP translation to take into account this relaxation. Finally, we add support for repairing constraints which use property paths and equality of paths. Our implementation in clingo is - to the best of our knowledge - the first implementation of a repair program for SHACL.

Cite as

Shqiponja Ahmetaj, Robert David, Axel Polleres, and Mantas Šimkus. A Logic Programming Approach to Repairing SHACL Constraint Violations. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3, pp. 1:1-1:36, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{ahmetaj_et_al:TGDK.3.3.1,
  author =	{Ahmetaj, Shqiponja and David, Robert and Polleres, Axel and \v{S}imkus, Mantas},
  title =	{{A Logic Programming Approach to Repairing SHACL Constraint Violations}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:36},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.3.1},
  URN =		{urn:nbn:de:0030-drops-252124},
  doi =		{10.4230/TGDK.3.3.1},
  annote =	{Keywords: SHACL, Shapes Constraint Language, Database Repairs, Knowledge Graphs, Semantic Web, Answer Set Programming}
}

Document

Survey

DOI: 10.4230/TGDK.3.2.1

Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2

Abstract

In recent years, knowledge graphs have gained interest and witnessed widespread applications in various domains, such as information retrieval, question-answering, recommendation systems, amongst others. Large-scale knowledge graphs to this end have demonstrated their utility in effectively representing structured knowledge. To further facilitate the application of machine learning techniques, knowledge graph embedding models have been developed. Such models can transform entities and relationships within knowledge graphs into vectors. However, these embedding models often face challenges related to noise, missing information, distribution shift, adversarial attacks, etc. This can lead to sub-optimal embeddings and incorrect inferences, thereby negatively impacting downstream applications. While the existing literature has focused so far on adversarial attacks on KGE models, the challenges related to the other critical aspects remain unexplored. In this paper, we, first of all, give a unified definition of resilience, encompassing several factors such as generalisation, in-distribution generalization, distribution adaption, and robustness. After formalizing these concepts for machine learning in general, we define them in the context of knowledge graphs. To find the gap in the existing works on resilience in the context of knowledge graphs, we perform a systematic survey, taking into account all these aspects mentioned previously. Our survey results show that most of the existing works focus on a specific aspect of resilience, namely robustness. After categorizing such works based on their respective aspects of resilience, we discuss the challenges and future research directions.

Cite as

Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo. Resilience in Knowledge Graph Embeddings. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 1:1-1:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{sharma_et_al:TGDK.3.2.1,
  author =	{Sharma, Arnab and Kouagou, N'Dah Jean and Ngomo, Axel-Cyrille Ngonga},
  title =	{{Resilience in Knowledge Graph Embeddings}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:38},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.1},
  URN =		{urn:nbn:de:0030-drops-248117},
  doi =		{10.4230/TGDK.3.2.1},
  annote =	{Keywords: Knowledge graphs, Resilience, Robustness}
}

Document

Research

DOI: 10.4230/TGDK.3.2.3

GraphRAG on Technical Documents - Impact of Knowledge Graph Schema

Authors: Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2

Abstract

Retrieval Augmented Generation (RAG) is seeing rapid adoption in industry to enable employees to query information captured in proprietary data for their organisation. In this work, we test the impact of domain-relevant knowledge graph schemas on the results of Microsoft’s GraphRAG pipeline. Our approach aims to address the poor quality of GraphRAG responses on technical reports rich in domain-specific terms. The use case involves technical reports about geology, chemistry and mineral processing published by the Minerals Research Institute of Western Australia (MRIWA). Four schemas are considered: a simple five-class minerals domain expert-developed schema, an expanded minerals domain schema, the Microsoft GraphRAG auto-generated schema, and a schema-less GraphRAG. These are compared to a conventional baseline RAG. Performance is evaluated using a scoring approach that accounts for the mix of correct, incorrect, additional, and missing content in RAG responses. The results show that the simple five-class minerals domain schema extracts approximately 10% more entities from the MRIWA reports than the other schema options. Additionally, both the five-class and the expanded eight-class minerals domain schemas produce the most factually correct answers and the fewest hallucinations. We attribute this to the minerals-specific schemas extracting more relevant, domain-specific information during the Indexing stage. As a result, the Query stage’s context window includes more high-value content. This contributes to the observed improvement in answer quality compared to the other pipelines. In contrast, pipelines with fewer domain-related entities in the KG retrieve less valuable information, leaving more room for irrelevant content in the context window. Baseline RAG responses were typically shorter, less complete, and contained more hallucinations compared to our GraphRAG pipelines. We provide a complete set of resources at https://github.com/nlp-tlp/GraphRAG-on-Minerals-Domain/tree/main. These resources include links to the MRIWA reports, a set of questions (from simple to challenging) along with domain-expert curated answers, schemas, and evaluations of the pipelines.

Cite as

Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke. GraphRAG on Technical Documents - Impact of Knowledge Graph Schema. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 3:1-3:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{scaffidi_et_al:TGDK.3.2.3,
  author =	{Scaffidi, Henri and Hodkiewicz, Melinda and Woods, Caitlin and Roocke, Nicole},
  title =	{{GraphRAG on Technical Documents - Impact of Knowledge Graph Schema}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:24},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.3},
  URN =		{urn:nbn:de:0030-drops-248131},
  doi =		{10.4230/TGDK.3.2.3},
  annote =	{Keywords: RAG, minerals, local search, global search, entity extraction, competency questions}
}

Document

Research

DOI: 10.4230/TGDK.3.1.4

CoaKG: A Contextualized Knowledge Graph Approach for Exploratory Search and Decision Making

Authors: Veronica dos Santos, Daniel Schwabe, Altigran Soares da Silva, and Sérgio Lifschitz

Published in: TGDK, Volume 3, Issue 1 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 1

Abstract

In decision-making scenarios, an information need arises due to a knowledge gap when a decision-maker needs more knowledge to make a decision. Users may take the initiative to acquire knowledge to fill this gap through exploratory search approaches using Knowledge Graphs (KGs) as information sources, but their queries can be incomplete, inaccurate, and ambiguous. Although KGs have great potential for exploratory search, they are incomplete by nature. Besides, for both Crowd-sourced KGs and KGs constructed by integrating several different information sources of varying quality to be effectively consumed, there is a need for a Trust Layer. Our research aims to enrich and allow querying KGs to support context-aware exploration in decision-making scenarios. We propose a layered architecture for Context Augmented Knowledge Graphs-based Decision Support Systems with a Knowledge Layer that operates under a Dual Open World Assumption (DOWA). Under DOWA, the evaluation of the truthfulness of the information obtained from KGs depends on the context of its claims and the tasks carried out or intended (purpose). The Knowledge Layer comprises a Context Augmented KG (CoaKG) and a CoaKG Query Engine. The CoaKG contains contextual mappings to identify explicit context and rules to infer implicit context. The CoaKG Query Engine is designed as a query-answering approach that retrieves all contextualized answers from the CoaKG. A Proof of Concept (PoC) based on Wikidata was developed to evaluate the effectiveness of the Knowledge Layer.

Cite as

Veronica dos Santos, Daniel Schwabe, Altigran Soares da Silva, and Sérgio Lifschitz. CoaKG: A Contextualized Knowledge Graph Approach for Exploratory Search and Decision Making. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 4:1-4:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{dossantos_et_al:TGDK.3.1.4,
  author =	{dos Santos, Veronica and Schwabe, Daniel and da Silva, Altigran Soares and Lifschitz, S\'{e}rgio},
  title =	{{CoaKG: A Contextualized Knowledge Graph Approach for Exploratory Search and Decision Making}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:27},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.4},
  URN =		{urn:nbn:de:0030-drops-236685},
  doi =		{10.4230/TGDK.3.1.4},
  annote =	{Keywords: Knowledge Graphs, Context Search, Decision Support}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2025.49

Submodular Hypergraph Partitioning: Metric Relaxations and Fast Algorithms via an Improved Cut-Matching Game

Authors: Antares Chen, Lorenzo Orecchia, and Erasmo Tani

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)

Abstract

Despite there being significant work on developing spectral- [Chan et al., 2018; Lau et al., 2023; Kwok et al., 2022], and metric-embedding-based [Louis and Makarychev, 2016] approximation algorithms for hypergraph conductance, little is known regarding the approximability of other hypergraph partitioning objectives. This work proposes algorithms for a general model of hypergraph partitioning that unifies both undirected and directed versions of many well-studied partitioning objectives. The first contribution of this paper introduces polymatroidal cut functions, a large class of cut functions amenable to approximation algorithms via metric embeddings and routing multicommodity flows. We demonstrate a simple O(√{log n})-approximation, where n is the number of vertices in the hypergraph, for these problems by rounding relaxations to metrics of negative-type. The second contribution of this paper generalizes the cut-matching game framework of Khandekar et al. [Khandekar et al., 2007] to tackle polymatroidal cut functions. This yields an almost-linear time O(log n)-approximation algorithm for standard versions of undirected and directed hypergraph partitioning [Kwok et al., 2022]. A technical contribution of our construction is a novel cut-matching game, which greatly relaxes the set of allowed actions by the cut player and allows for the use of approximate s-t maximum flows by the matching player. We believe this to be of independent interest.

Cite as

Antares Chen, Lorenzo Orecchia, and Erasmo Tani. Submodular Hypergraph Partitioning: Metric Relaxations and Fast Algorithms via an Improved Cut-Matching Game. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 49:1-49:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.ICALP.2025.49,
  author =	{Chen, Antares and Orecchia, Lorenzo and Tani, Erasmo},
  title =	{{Submodular Hypergraph Partitioning: Metric Relaxations and Fast Algorithms via an Improved Cut-Matching Game}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{49:1--49:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.49},
  URN =		{urn:nbn:de:0030-drops-234261},
  doi =		{10.4230/LIPIcs.ICALP.2025.49},
  annote =	{Keywords: Hypergraph Partitioning, Cut Improvement, Cut-Matching Game}
}

Document

Survey

DOI: 10.4230/TGDK.3.1.3

Uncertainty Management in the Construction of Knowledge Graphs: A Survey

Authors: Lucas Jarnac, Yoan Chabot, and Miguel Couceiro

Published in: TGDK, Volume 3, Issue 1 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 1

Abstract

Knowledge Graphs (KGs) are a major asset for companies thanks to their great flexibility in data representation and their numerous applications, e.g., vocabulary sharing, Q&A or recommendation systems. To build a KG, it is a common practice to rely on automatic methods for extracting knowledge from various heterogeneous sources. However, in a noisy and uncertain world, knowledge may not be reliable and conflicts between data sources may occur. Integrating unreliable data would directly impact the use of the KG, therefore such conflicts must be resolved. This could be done manually by selecting the best data to integrate. This first approach is highly accurate, but costly and time-consuming. That is why recent efforts focus on automatic approaches, which represent a challenging task since it requires handling the uncertainty of extracted knowledge throughout its integration into the KG. We survey state-of-the-art approaches in this direction and present constructions of both open and enterprise KGs. We then describe different knowledge extraction methods and discuss downstream tasks after knowledge acquisition, including KG completion using embedding models, knowledge alignment, and knowledge fusion in order to address the problem of knowledge uncertainty in KG construction. We conclude with a discussion on the remaining challenges and perspectives when constructing a KG taking into account uncertainty.

Cite as

Lucas Jarnac, Yoan Chabot, and Miguel Couceiro. Uncertainty Management in the Construction of Knowledge Graphs: A Survey. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 3:1-3:48, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{jarnac_et_al:TGDK.3.1.3,
  author =	{Jarnac, Lucas and Chabot, Yoan and Couceiro, Miguel},
  title =	{{Uncertainty Management in the Construction of Knowledge Graphs: A Survey}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:48},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.3},
  URN =		{urn:nbn:de:0030-drops-233733},
  doi =		{10.4230/TGDK.3.1.3},
  annote =	{Keywords: Knowledge reconciliation, Uncertainty, Heterogeneous sources, Knowledge graph construction}
}

Document

Resource

DOI: 10.4230/TGDK.3.1.1

KG2Tables: A Domain-Specific Tabular Data Generator to Evaluate Semantic Table Interpretation Systems

Authors: Nora Abdelmageed, Ernesto Jiménez-Ruiz, Oktie Hassanzadeh, and Birgitta König-Ries

Published in: TGDK, Volume 3, Issue 1 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 1

Abstract

Tabular data, often in the form of CSV files, plays a pivotal role in data analytics pipelines. Understanding this data semantically, known as Semantic Table Interpretation (STI), is crucial but poses challenges due to several factors such as the ambiguity of labels. As a result, STI has gained increasing attention from the community in the past few years. Evaluating STI systems requires well-established benchmarks. Most of the existing large-scale benchmarks are derived from general domain sources and focus on ambiguity, while domain-specific benchmarks are relatively small in size. This paper introduces KG2Tables, a framework that can construct domain-specific large-scale benchmarks from a Knowledge Graph (KG). KG2Tables leverages the internal hierarchy of the relevant KG concepts and their properties. As a proof of concept, we have built large datasets in the food, biodiversity, and biomedical domains. The resulting datasets, tFood, tBiomed, and tBiodiv, have been made available for the public in the ISWC SemTab challenge (2023 and 2024 editions). We include the evaluation results of top-performing STI systems using tFood Such results underscore its potential as a robust evaluation benchmark for challenging STI systems. We demonstrate the data quality level using a sample-based approach for the generated benchmarks including, for example, realistic tables assessment. Nevertheless, we provide an extensive discussion of KG2Tables explaining how it could be used to create other benchmarks from any domain of interest and including its key features and limitations with suggestions to overcome them.

Cite as

Nora Abdelmageed, Ernesto Jiménez-Ruiz, Oktie Hassanzadeh, and Birgitta König-Ries. KG2Tables: A Domain-Specific Tabular Data Generator to Evaluate Semantic Table Interpretation Systems. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 1:1-1:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{abdelmageed_et_al:TGDK.3.1.1,
  author =	{Abdelmageed, Nora and Jim\'{e}nez-Ruiz, Ernesto and Hassanzadeh, Oktie and K\"{o}nig-Ries, Birgitta},
  title =	{{KG2Tables: A Domain-Specific Tabular Data Generator to Evaluate Semantic Table Interpretation Systems}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:28},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.1},
  URN =		{urn:nbn:de:0030-drops-230104},
  doi =		{10.4230/TGDK.3.1.1},
  annote =	{Keywords: Semantic Table Interpretation (STI), Knowledge Graph (KG), STI Benchmark, Food, Biodiversity, Biomedical}
}

Document

Resource Paper

DOI: 10.4230/TGDK.2.2.4

FAIR Jupyter: A Knowledge Graph Approach to Semantic Sharing and Granular Exploration of a Computational Notebook Reproducibility Dataset

Authors: Sheeba Samuel and Daniel Mietchen

Published in: TGDK, Volume 2, Issue 2 (2024): Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 2, Issue 2

Abstract

The way in which data are shared can affect their utility and reusability. Here, we demonstrate how data that we had previously shared in bulk can be mobilized further through a knowledge graph that allows for much more granular exploration and interrogation. The original dataset is about the computational reproducibility of GitHub-hosted Jupyter notebooks associated with biomedical publications. It contains rich metadata about the publications, associated GitHub repositories and Jupyter notebooks, and the notebooks' reproducibility. We took this dataset, converted it into semantic triples and loaded these into a triple store to create a knowledge graph - FAIR Jupyter - that we made accessible via a web service. This enables granular data exploration and analysis through queries that can be tailored to specific use cases. Such queries may provide details about any of the variables from the original dataset, highlight relationships between them or combine some of the graph’s content with materials from corresponding external resources. We provide a collection of example queries addressing a range of use cases in research and education. We also outline how sets of such queries can be used to profile specific content types, either individually or by class. We conclude by discussing how such a semantically enhanced sharing of complex datasets can both enhance their FAIRness - i.e., their findability, accessibility, interoperability, and reusability - and help identify and communicate best practices, particularly with regards to data quality, standardization, automation and reproducibility.

Cite as

Sheeba Samuel and Daniel Mietchen. FAIR Jupyter: A Knowledge Graph Approach to Semantic Sharing and Granular Exploration of a Computational Notebook Reproducibility Dataset. In Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 2, pp. 4:1-4:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@Article{samuel_et_al:TGDK.2.2.4,
  author =	{Samuel, Sheeba and Mietchen, Daniel},
  title =	{{FAIR Jupyter: A Knowledge Graph Approach to Semantic Sharing and Granular Exploration of a Computational Notebook Reproducibility Dataset}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:24},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.2.4},
  URN =		{urn:nbn:de:0030-drops-225886},
  doi =		{10.4230/TGDK.2.2.4},
  annote =	{Keywords: Knowledge Graph, Computational reproducibility, Jupyter notebooks, FAIR data, PubMed Central, GitHub, Python, SPARQL}
}

Document

DOI: 10.4230/TGDK.2.3.1

Unified Multimedia Segmentation - A Comprehensive Model for URI-based Media Segment Representation

Authors: Jan Willi, Abraham Bernstein, and Luca Rossetto

Published in: TGDK, Volume 2, Issue 3 (2024). Transactions on Graph Data and Knowledge, Volume 2, Issue 3

Abstract

In multimedia annotation, referencing specific segments of a document is often desired due to its richness and multimodality, but no universal representation for such references exists. This significantly hampers the usage of multimedia content in knowledge graphs, as it is modeled as one large atomic information container. Unstructured data - such as text, audio, images, and video - can commonly be decomposed into its constituent parts, as such documents rarely contain only one semantic concept. Hence, it is reasonable to assume that these advances will make it possible to decompose these previous atomic components into logical segments. To be processable by the knowledge graph stack, however, one needs to break the atomic nature of multimedia content, providing a mechanism to address media segments. This paper proposes a Unified Segmentation Model capable of depicting arbitrary segmentations on any media document type. The work begins with a formal analysis of multimedia and segmentation, exploring segmentation operations and how to describe them. Building on this analysis, it then develops a practical scheme for expressing segmentation in Uniform Resource Identifiers (URIs). Given that this approach makes segments of multimedia content referencable, it breaks their atomic nature and makes them first-class citizens within knowledge graphs. The proposed model is implemented as a proof of concept in the MediaGraph Store, a multimedia knowledge graph storage and querying engine.

Cite as

Jan Willi, Abraham Bernstein, and Luca Rossetto. Unified Multimedia Segmentation - A Comprehensive Model for URI-based Media Segment Representation. In Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 3, pp. 1:1-1:34, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@Article{willi_et_al:TGDK.2.3.1,
  author =	{Willi, Jan and Bernstein, Abraham and Rossetto, Luca},
  title =	{{Unified Multimedia Segmentation - A Comprehensive Model for URI-based Media Segment Representation}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:34},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.3.1},
  URN =		{urn:nbn:de:0030-drops-225953},
  doi =		{10.4230/TGDK.2.3.1},
  annote =	{Keywords: Multimodal Knowledge Graphs, Multimedia Segmentation, Multimedia Representation}
}

Document

Resource Paper

DOI: 10.4230/TGDK.2.2.5

The Reasonable Ontology Templates Framework

Authors: Martin Georg Skjæveland and Leif Harald Karlsen

Published in: TGDK, Volume 2, Issue 2 (2024): Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 2, Issue 2

Abstract

Reasonable Ontology Templates (OTTR) is a templating language for representing and instantiating patterns. It is based on simple and generic, but powerful, mechanisms such as recursive macro expansion, term substitution and type systems, and is designed particularly for building and maintaining RDF knowledge graphs and OWL ontologies. In this resource paper, we present the formal specifications that define the OTTR framework. This includes the fundamentals of the OTTR language and the adaptions to make it fit with standard semantic web languages, and two serialization formats developed for semantic web practitioners. We also present the OTTR framework’s support for documenting, publishing and managing template libraries, and for tools for practical bulk instantiation of templates from tabular data and queryable data sources. The functionality of the OTTR framework is available for use through Lutra, an open-source reference implementation, and other independent implementations. We report on the use and impact of OTTR by presenting selected industrial use cases. Finally, we reflect on some design considerations of the language and framework and present ideas for future work.

Cite as

Martin Georg Skjæveland and Leif Harald Karlsen. The Reasonable Ontology Templates Framework. In Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 2, pp. 5:1-5:54, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@Article{skjaeveland_et_al:TGDK.2.2.5,
  author =	{Skj{\ae}veland, Martin Georg and Karlsen, Leif Harald},
  title =	{{The Reasonable Ontology Templates Framework}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{5:1--5:54},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.2.5},
  URN =		{urn:nbn:de:0030-drops-225896},
  doi =		{10.4230/TGDK.2.2.5},
  annote =	{Keywords: Ontology engineering, Ontology design patterns, Template mechanism, Macros}
}

Document

Resource Paper

DOI: 10.4230/TGDK.2.2.3

The dblp Knowledge Graph and SPARQL Endpoint

Authors: Marcel R. Ackermann, Hannah Bast, Benedikt Maria Beckermann, Johannes Kalmbach, Patrick Neises, and Stefan Ollinger

Published in: TGDK, Volume 2, Issue 2 (2024): Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 2, Issue 2

Abstract

For more than 30 years, the dblp computer science bibliography has provided quality-checked and curated bibliographic metadata on major computer science journals, proceedings, and monographs. Its semantic content has been published as RDF or similar graph data by third parties in the past, but most of these resources have now disappeared from the web or are no longer actively synchronized with the latest dblp data. In this article, we introduce the dblp Knowledge Graph (dblp KG), the first semantic representation of the dblp data that is designed and maintained by the dblp team. The dataset is augmented by citation data from the OpenCitations corpus. Open and FAIR access to the data is provided via daily updated RDF dumps, persistently archived monthly releases, a new public SPARQL endpoint with a powerful user interface, and a linked open data API. We also make it easy to self-host a replica of our SPARQL endpoint. We provide an introduction on how to work with the dblp KG and the added citation data using our SPARQL endpoint, with several example queries. Finally, we present the results of a small performance evaluation.

Cite as

Marcel R. Ackermann, Hannah Bast, Benedikt Maria Beckermann, Johannes Kalmbach, Patrick Neises, and Stefan Ollinger. The dblp Knowledge Graph and SPARQL Endpoint. In Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 2, pp. 3:1-3:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@Article{ackermann_et_al:TGDK.2.2.3,
  author =	{Ackermann, Marcel R. and Bast, Hannah and Beckermann, Benedikt Maria and Kalmbach, Johannes and Neises, Patrick and Ollinger, Stefan},
  title =	{{The dblp Knowledge Graph and SPARQL Endpoint}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:23},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.2.3},
  URN =		{urn:nbn:de:0030-drops-225870},
  doi =		{10.4230/TGDK.2.2.3},
  annote =	{Keywords: dblp, Scholarly Knowledge Graph, Resource, RDF, SPARQL}
}

Document

Survey

DOI: 10.4230/TGDK.2.1.3

Semantic Web: Past, Present, and Future

Authors: Ansgar Scherp, Gerd Groener, Petr Škoda, Katja Hose, and Maria-Esther Vidal

Published in: TGDK, Volume 2, Issue 1 (2024): Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge, Volume 2, Issue 1

Abstract

Ever since the vision was formulated, the Semantic Web has inspired many generations of innovations. Semantic technologies have been used to share vast amounts of information on the Web, enhance them with semantics to give them meaning, and enable inference and reasoning on them. Throughout the years, semantic technologies, and in particular knowledge graphs, have been used in search engines, data integration, enterprise settings, and machine learning. In this paper, we recap the classical concepts and foundations of the Semantic Web as well as modern and recent concepts and applications, building upon these foundations. The classical topics we cover include knowledge representation, creating and validating knowledge on the Web, reasoning and linking, and distributed querying. We enhance this classical view of the so-called "Semantic Web Layer Cake" with an update of recent concepts that include provenance, security and trust, as well as a discussion of practical impacts from industry-led contributions. We conclude with an outlook on the future directions of the Semantic Web. This is a living document. If you like to contribute, please contact the first author and visit: https://github.com/ascherp/semantic-web-primer

Cite as

Ansgar Scherp, Gerd Groener, Petr Škoda, Katja Hose, and Maria-Esther Vidal. Semantic Web: Past, Present, and Future. In Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 1, pp. 3:1-3:37, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@Article{scherp_et_al:TGDK.2.1.3,
  author =	{Scherp, Ansgar and Groener, Gerd and \v{S}koda, Petr and Hose, Katja and Vidal, Maria-Esther},
  title =	{{Semantic Web: Past, Present, and Future}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:37},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.1.3},
  URN =		{urn:nbn:de:0030-drops-198607},
  doi =		{10.4230/TGDK.2.1.3},
  annote =	{Keywords: Linked Open Data, Semantic Web Graphs, Knowledge Graphs}
}

23 Search Results for "Sequeda, Juan F."

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message