41 Search Results for "TGDK"

Research
A Logic Programming Approach to Repairing SHACL Constraint Violations

Authors: Shqiponja Ahmetaj, Robert David, Axel Polleres, and Mantas Šimkus

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3 (2025)


Abstract
The Shapes Constraint Language (SHACL) is a recent W3C recommendation for validating RDF graphs against shape constraints checked on target nodes of a data graph. The standard also describes the notion of validation reports, which detail the results of the validation process. In case of constraint violations, the validation report should explain the reasons for non-validation, offering guidance on how to identify or fix violations in the data graph. Since the specification leaves it open to SHACL processors to define such explanations, a recent work proposed explanations in the style of database repairs, where a repair is a set of additions to or deletions from the data graph such that the resulting graph validates against the constraints. In this paper, we study such repairs for non-recursive SHACL, the largest fragment of SHACL that is fully defined in the specification. We propose an algorithm to compute repairs by encoding the explanation problem into a logic program using Answer Set Programming (ASP), where the answer sets contain (minimal) repairs. We then study a scenario where it is not possible to repair all the targets simultaneously, which may be the case due to overall unsatisfiability or conflicting constraints. We introduce a relaxed notion of validation, which allows validating a (maximal) subset of the targets, and adapt the ASP translation to take this relaxation into account. Finally, we add support for repairing constraints which use property paths and equality of paths. Our implementation in clingo is, to the best of our knowledge, the first implementation of a repair program for SHACL.
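The repair notion described above can be sketched as a toy, brute-force search (a minimal illustration with invented node and predicate names; the paper instead encodes the problem as an ASP program solved by clingo):

```python
from itertools import combinations

# Toy illustration of graph repairs (NOT the paper's ASP encoding; all names
# are invented): a data graph is a set of (subject, predicate, object)
# triples, a constraint is a Python predicate over the graph, and a repair is
# a minimum-size set of additions/deletions after which no target violates.

def violates(graph):
    """Constraint: every node typed :Person must have a :name edge.
    Returns the set of violating target nodes."""
    persons = {s for (s, p, o) in graph if p == "a" and o == ":Person"}
    named = {s for (s, p, o) in graph if p == ":name"}
    return persons - named

def minimal_repairs(graph, candidates):
    """Brute-force all minimum-size repairs drawn from candidate edits,
    where an edit is ('add', triple) or ('del', triple)."""
    def apply(edits):
        g = set(graph)
        for op, triple in edits:
            if op == "add":
                g.add(triple)
            else:
                g.discard(triple)
        return g
    for k in range(len(candidates) + 1):        # try smaller repairs first
        found = [set(c) for c in combinations(candidates, k)
                 if not violates(apply(c))]
        if found:
            return found                         # all minimum-size repairs
    return []

data = {("alice", "a", ":Person"), ("bob", "a", ":Person"),
        ("alice", ":name", "Alice")}
cands = [("add", ("bob", ":name", "Bob")),       # repair by addition
         ("del", ("bob", "a", ":Person"))]       # repair by deletion
print(minimal_repairs(data, cands))              # two one-edit repairs
```

In the ASP formulation, each answer set plays the role of one such repair, with minimality handled by the solver rather than by exhaustive enumeration.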

Cite as

Shqiponja Ahmetaj, Robert David, Axel Polleres, and Mantas Šimkus. A Logic Programming Approach to Repairing SHACL Constraint Violations. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3, pp. 1:1-1:36, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{ahmetaj_et_al:TGDK.3.3.1,
  author =	{Ahmetaj, Shqiponja and David, Robert and Polleres, Axel and \v{S}imkus, Mantas},
  title =	{{A Logic Programming Approach to Repairing SHACL Constraint Violations}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:36},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.3.1},
  URN =		{urn:nbn:de:0030-drops-252124},
  doi =		{10.4230/TGDK.3.3.1},
  annote =	{Keywords: SHACL, Shapes Constraint Language, Database Repairs, Knowledge Graphs, Semantic Web, Answer Set Programming}
}
Use Case
Automating Invoice Validation with Knowledge Graphs: Optimizations and Practical Lessons

Authors: Johannes Mäkelburg and Maribel Acosta

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3 (2025)


Abstract
To increase the efficiency of creating, distributing, and processing invoices, invoicing is handled in the form of Electronic Data Interchange (EDI), in which invoices are exchanged in a standardized electronic format rather than on paper. While EDIFACT is widely used for electronic invoicing, there is no standardized approach for validating its content. In this work, we tackle the problem of automatically validating electronic invoices in the EDIFACT format by leveraging knowledge graph (KG) technologies. We build on a previously developed pipeline that transforms EDIFACT invoices into RDF KGs. The resulting graphs are validated using SHACL constraints defined in collaboration with domain experts. We improve the pipeline by enhancing the correctness of the invoice representation, reducing validation time, and introducing error prioritization through the use of the severity predicate in SHACL. These improvements make validation results easier to interpret and significantly reduce the manual effort required. Our evaluation confirms that the approach is correct, efficient, and practical for real-world use.
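The severity-based prioritization mentioned above can be illustrated with a small, hypothetical sketch (the report entries and field names below are invented; SHACL itself defines the three severities sh:Violation, sh:Warning, and sh:Info):

```python
# Hypothetical sketch of severity-based error prioritization: SHACL's three
# severity levels induce a ranking, so validation results can be sorted with
# the most critical findings first. The report entries are invented examples.

SEVERITY_RANK = {"sh:Violation": 0, "sh:Warning": 1, "sh:Info": 2}

def prioritize(results):
    """Order validation results by severity, most critical first."""
    return sorted(results, key=lambda r: SEVERITY_RANK[r["severity"]])

report = [
    {"focusNode": ":invoice42", "severity": "sh:Info",
     "message": "Optional delivery date missing"},
    {"focusNode": ":invoice42", "severity": "sh:Violation",
     "message": "Line-item amounts do not sum to invoice total"},
    {"focusNode": ":invoice17", "severity": "sh:Warning",
     "message": "Unusual currency code"},
]

for r in prioritize(report):
    print(r["severity"], r["focusNode"], "-", r["message"])
```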

Cite as

Johannes Mäkelburg and Maribel Acosta. Automating Invoice Validation with Knowledge Graphs: Optimizations and Practical Lessons. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3, pp. 2:1-2:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{makelburg_et_al:TGDK.3.3.2,
  author =	{M\"{a}kelburg, Johannes and Acosta, Maribel},
  title =	{{Automating Invoice Validation with Knowledge Graphs: Optimizations and Practical Lessons}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{2:1--2:24},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.3.2},
  URN =		{urn:nbn:de:0030-drops-252137},
  doi =		{10.4230/TGDK.3.3.2},
  annote =	{Keywords: Electronic Invoice, Ontology, EDIFACT, RDF, RML, SHACL}
}
Resource
Supporting Psychometric Instrument Usage Through the POEM Ontology

Authors: Kelsey Rook, Henrique Santos, Deborah L. McGuinness, Manuel S. Sprung, Paulo Pinheiro, and Bruce F. Chorpita

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3 (2025)


Abstract
Psychometrics is the field relating to the measurement of concepts within psychology, particularly the assessment of various social and psychological dimensions in humans. The relationship between psychometric entities is critical to finding an appropriate assessment instrument, especially in the context of clinical psychology and mental healthcare, in which providing the best care based on empirical evidence is crucial. We aim to model these entities, which include psychometric questionnaires and their component elements, the subject and respondent, and the latent variables being assessed. The current standard for questionnaire-based assessment relies on text-based distributions of instruments, so a structured representation is necessary to capture these relationships: it enhances the accessibility and use of existing measures, encourages reuse of questionnaires and their component elements, and enables sophisticated reasoning over assessment instruments and results by increasing interoperability. We present the design process and architecture of such a domain ontology, the Psychometric Ontology of Experiences and Measures (POEM), situate it within the context of related ontologies, and demonstrate its practical utility through evaluation against a series of competency questions concerning the creation, use, and reuse of psychometric questionnaires in clinical, research, and development settings.

Cite as

Kelsey Rook, Henrique Santos, Deborah L. McGuinness, Manuel S. Sprung, Paulo Pinheiro, and Bruce F. Chorpita. Supporting Psychometric Instrument Usage Through the POEM Ontology. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3, pp. 3:1-3:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{rook_et_al:TGDK.3.3.3,
  author =	{Rook, Kelsey and Santos, Henrique and McGuinness, Deborah L. and Sprung, Manuel S. and Pinheiro, Paulo and Chorpita, Bruce F.},
  title =	{{Supporting Psychometric Instrument Usage Through the POEM Ontology}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:19},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.3.3},
  URN =		{urn:nbn:de:0030-drops-252148},
  doi =		{10.4230/TGDK.3.3.3},
  annote =	{Keywords: ontology, ontology development, psychometric assessment, psychometric ontology}
}
Research
Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web

Authors: Florian Ruosch, Cristina Sarasua, and Abraham Bernstein

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3 (2025)


Abstract
In Argument Mining, predicting argumentative relations between texts (or spans) remains one of the most challenging aspects, even more so in the cross-document setting. This paper makes three key contributions to advance research in this domain. We first extend an existing dataset, the Sci-Arg corpus, by annotating it with explicit inter-document argumentative relations, thereby allowing arguments to be distributed over several documents forming an Argument Web; these new annotations are published using Semantic Web technologies (RDF, OWL). Second, we explore and evaluate three automated approaches for predicting these inter-document argumentative relations, establishing critical baselines on the new dataset. We find that a simple classifier based on discourse indicators with access to context outperforms neural methods. Third, we conduct a comparative analysis of these approaches for both intra- and inter-document settings, identifying statistically significant differences in results that indicate the necessity of distinguishing between these two scenarios. Our findings highlight significant challenges in this complex domain and open crucial avenues for future research on the Argument Web of Science, particularly for those interested in leveraging Semantic Web technologies and knowledge graphs to understand scholarly discourse. With this, we provide the first stepping stones in the form of a benchmark dataset, three baseline methods, and an initial analysis for a systematic exploration of this field relevant to the Web of Data and Science.

Cite as

Florian Ruosch, Cristina Sarasua, and Abraham Bernstein. Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3, pp. 4:1-4:33, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{ruosch_et_al:TGDK.3.3.4,
  author =	{Ruosch, Florian and Sarasua, Cristina and Bernstein, Abraham},
  title =	{{Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:33},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.3.4},
  URN =		{urn:nbn:de:0030-drops-252159},
  doi =		{10.4230/TGDK.3.3.4},
  annote =	{Keywords: Argument Mining, Large Language Models, Knowledge Graphs, Link Prediction}
}
Use Case
LLM-Supported Manufacturing Mapping Generation

Authors: Wilma Johanna Schmidt, Irlan Grangel-González, Adrian Paschke, and Evgeny Kharlamov

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3 (2025)


Abstract
In large manufacturing companies, such as Bosch, that operate thousands of production lines, each comprising up to dozens of production machines and other equipment, even simple inventory questions, such as the location and quantity of a particular equipment type, require non-trivial solutions. Answering these questions requires integrating multiple heterogeneous data sets, which is time-consuming and error-prone and demands both domain experts and knowledge experts. Knowledge graphs (KGs) are practical for consolidating inventory data by bringing it into the same format and linking inventory items. However, KG creation and maintenance pose challenges of their own, as mappings are needed to connect data sets and ontologies. In this work, we address these challenges by exploring LLM-supported and context-enhanced generation of both YARRRML and RML mappings. Facing large ontologies in the manufacturing domain and token limitations in LLM prompts, we further evaluate ontology reduction methods in our approach. We evaluate our approach both quantitatively, against reference mappings created manually by experts, and, for YARRRML, qualitatively with expert feedback. This work extends the exploration of the challenges of LLM-supported and context-enhanced mapping generation for YARRRML [Schmidt et al., 2025] with comprehensive analyses of RML mappings and an ontology reduction evaluation. We further publish the source code of this work. Our work provides valuable support when creating manufacturing mappings and supports data and schema updates.

Cite as

Wilma Johanna Schmidt, Irlan Grangel-González, Adrian Paschke, and Evgeny Kharlamov. LLM-Supported Manufacturing Mapping Generation. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3, pp. 5:1-5:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{schmidt_et_al:TGDK.3.3.5,
  author =	{Schmidt, Wilma Johanna and Grangel-Gonz\'{a}lez, Irlan and Paschke, Adrian and Kharlamov, Evgeny},
  title =	{{LLM-Supported Manufacturing Mapping Generation}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{5:1--5:22},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.3.5},
  URN =		{urn:nbn:de:0030-drops-252164},
  doi =		{10.4230/TGDK.3.3.5},
  annote =	{Keywords: Mapping Generation, Knowledge Graph Construction, Ontology Reduction, RML, YARRRML, LLM, Manufacturing}
}
Survey
Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2 (2025)


Abstract
In recent years, knowledge graphs have gained interest and witnessed widespread applications in various domains, such as information retrieval, question answering, and recommendation systems. To this end, large-scale knowledge graphs have demonstrated their utility in effectively representing structured knowledge. To further facilitate the application of machine learning techniques, knowledge graph embedding (KGE) models have been developed. Such models can transform the entities and relationships within a knowledge graph into vectors. However, these embedding models often face challenges related to noise, missing information, distribution shift, and adversarial attacks, which can lead to sub-optimal embeddings and incorrect inferences, thereby negatively impacting downstream applications. While the existing literature has so far focused on adversarial attacks on KGE models, the challenges related to the other critical aspects remain unexplored. In this paper, we first give a unified definition of resilience, encompassing several factors such as generalization, in-distribution generalization, distribution adaptation, and robustness. After formalizing these concepts for machine learning in general, we define them in the context of knowledge graphs. To find the gaps in existing work on resilience in the context of knowledge graphs, we perform a systematic survey that takes all of these aspects into account. Our survey results show that most existing works focus on a specific aspect of resilience, namely robustness. After categorizing such works based on their respective aspects of resilience, we discuss the challenges and future research directions.

Cite as

Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo. Resilience in Knowledge Graph Embeddings. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 1:1-1:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{sharma_et_al:TGDK.3.2.1,
  author =	{Sharma, Arnab and Kouagou, N'Dah Jean and Ngomo, Axel-Cyrille Ngonga},
  title =	{{Resilience in Knowledge Graph Embeddings}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:38},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.1},
  URN =		{urn:nbn:de:0030-drops-248117},
  doi =		{10.4230/TGDK.3.2.1},
  annote =	{Keywords: Knowledge graphs, Resilience, Robustness}
}
Research
GraphRAG on Technical Documents - Impact of Knowledge Graph Schema

Authors: Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2 (2025)


Abstract
Retrieval Augmented Generation (RAG) is seeing rapid adoption in industry to enable employees to query information captured in proprietary data for their organisation. In this work, we test the impact of domain-relevant knowledge graph schemas on the results of Microsoft’s GraphRAG pipeline. Our approach aims to address the poor quality of GraphRAG responses on technical reports rich in domain-specific terms. The use case involves technical reports about geology, chemistry and mineral processing published by the Minerals Research Institute of Western Australia (MRIWA). Four schemas are considered: a simple five-class minerals domain expert-developed schema, an expanded minerals domain schema, the Microsoft GraphRAG auto-generated schema, and a schema-less GraphRAG. These are compared to a conventional baseline RAG. Performance is evaluated using a scoring approach that accounts for the mix of correct, incorrect, additional, and missing content in RAG responses. The results show that the simple five-class minerals domain schema extracts approximately 10% more entities from the MRIWA reports than the other schema options. Additionally, both the five-class and the expanded eight-class minerals domain schemas produce the most factually correct answers and the fewest hallucinations. We attribute this to the minerals-specific schemas extracting more relevant, domain-specific information during the Indexing stage. As a result, the Query stage’s context window includes more high-value content. This contributes to the observed improvement in answer quality compared to the other pipelines. In contrast, pipelines with fewer domain-related entities in the KG retrieve less valuable information, leaving more room for irrelevant content in the context window. Baseline RAG responses were typically shorter, less complete, and contained more hallucinations compared to our GraphRAG pipelines. 
We provide a complete set of resources at https://github.com/nlp-tlp/GraphRAG-on-Minerals-Domain/tree/main. These resources include links to the MRIWA reports, a set of questions (from simple to challenging) along with domain-expert curated answers, schemas, and evaluations of the pipelines.
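A scoring scheme of this general shape (a sketch only, not necessarily the paper's exact metric; the fact sets below are invented) compares the atomic facts stated in a RAG answer against a domain-expert reference answer:

```python
# Generic sketch of scoring a RAG response against a reference answer
# (not the paper's exact metric): facts stated and in the reference are
# correct; stated but absent from the reference are incorrect/additional;
# in the reference but unstated are missing.

def score(answer_facts, reference_facts):
    answer, reference = set(answer_facts), set(reference_facts)
    correct = answer & reference
    extra = answer - reference      # incorrect or merely additional content
    missing = reference - answer
    precision = len(correct) / len(answer) if answer else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"correct": len(correct), "extra": len(extra),
            "missing": len(missing), "f1": round(f1, 3)}

ref = {"ore grade is 2.1%", "deposit is in WA", "process uses flotation"}
ans = {"ore grade is 2.1%", "deposit is in WA", "deposit found in 1998"}
print(score(ans, ref))  # 2 correct, 1 extra, 1 missing
```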

Cite as

Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke. GraphRAG on Technical Documents - Impact of Knowledge Graph Schema. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 3:1-3:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{scaffidi_et_al:TGDK.3.2.3,
  author =	{Scaffidi, Henri and Hodkiewicz, Melinda and Woods, Caitlin and Roocke, Nicole},
  title =	{{GraphRAG on Technical Documents - Impact of Knowledge Graph Schema}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:24},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.3},
  URN =		{urn:nbn:de:0030-drops-248131},
  doi =		{10.4230/TGDK.3.2.3},
  annote =	{Keywords: RAG, minerals, local search, global search, entity extraction, competency questions}
}
Research
Distances Between Formal Concept Analysis Structures

Authors: Alexandre Bazin and Giacomo Kahn

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2 (2025)


Abstract
In this paper, we study the notion of distance between the most important structures of formal concept analysis: formal contexts, concept lattices, and implication bases. We first define three families of Minkowski-like distances between these three structures. We then present experiments showing that the correlations of these distances are low and depend on the distance between formal contexts.
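One natural instance of such a distance between formal contexts can be written as follows (a sketch only: the paper defines whole families of Minkowski-like distances, of which this incidence-matrix distance is merely one plausible member):

```python
# Sketch of a Minkowski-like distance between two formal contexts over
# fixed object and attribute sets: view each context as a 0/1 incidence
# matrix and take the Minkowski distance between the matrices. This is one
# plausible instance, not necessarily any of the paper's definitions.

def minkowski_context_distance(ctx_a, ctx_b, objects, attributes, p=1):
    """ctx_a, ctx_b: sets of (object, attribute) incidence pairs."""
    total = 0
    for g in objects:
        for m in attributes:
            # bools subtract as 0/1, so this is the entrywise difference
            total += abs(((g, m) in ctx_a) - ((g, m) in ctx_b)) ** p
    return total ** (1 / p)

G, M = ["g1", "g2"], ["m1", "m2"]
K1 = {("g1", "m1"), ("g2", "m2")}
K2 = {("g1", "m1"), ("g1", "m2")}
# For p=1 this is just the size of the symmetric difference of the contexts.
print(minkowski_context_distance(K1, K2, G, M, p=1))
```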

Cite as

Alexandre Bazin and Giacomo Kahn. Distances Between Formal Concept Analysis Structures. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 2:1-2:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{bazin_et_al:TGDK.3.2.2,
  author =	{Bazin, Alexandre and Kahn, Giacomo},
  title =	{{Distances Between Formal Concept Analysis Structures}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{2:1--2:18},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.2},
  URN =		{urn:nbn:de:0030-drops-248126},
  doi =		{10.4230/TGDK.3.2.2},
  annote =	{Keywords: Formal Concept Analysis, Implication Base, Concept Lattice, Pattern Mining, Ordinal Data Science}
}
Research
CoaKG: A Contextualized Knowledge Graph Approach for Exploratory Search and Decision Making

Authors: Veronica dos Santos, Daniel Schwabe, Altigran Soares da Silva, and Sérgio Lifschitz

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1 (2025)


Abstract
In decision-making scenarios, an information need arises due to a knowledge gap when a decision-maker needs more knowledge to make a decision. Users may take the initiative to acquire knowledge to fill this gap through exploratory search approaches using Knowledge Graphs (KGs) as information sources, but their queries can be incomplete, inaccurate, and ambiguous. Although KGs have great potential for exploratory search, they are incomplete by nature. Moreover, both crowd-sourced KGs and KGs constructed by integrating several information sources of varying quality need a Trust Layer to be effectively consumed. Our research aims to enrich KGs and allow querying them to support context-aware exploration in decision-making scenarios. We propose a layered architecture for Context Augmented Knowledge Graph-based Decision Support Systems with a Knowledge Layer that operates under a Dual Open World Assumption (DOWA). Under DOWA, the evaluation of the truthfulness of the information obtained from KGs depends on the context of its claims and the tasks carried out or intended (purpose). The Knowledge Layer comprises a Context Augmented KG (CoaKG) and a CoaKG Query Engine. The CoaKG contains contextual mappings to identify explicit context and rules to infer implicit context. The CoaKG Query Engine is designed as a query-answering approach that retrieves all contextualized answers from the CoaKG. A Proof of Concept (PoC) based on Wikidata was developed to evaluate the effectiveness of the Knowledge Layer.

Cite as

Veronica dos Santos, Daniel Schwabe, Altigran Soares da Silva, and Sérgio Lifschitz. CoaKG: A Contextualized Knowledge Graph Approach for Exploratory Search and Decision Making. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 4:1-4:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{dossantos_et_al:TGDK.3.1.4,
  author =	{dos Santos, Veronica and Schwabe, Daniel and da Silva, Altigran Soares and Lifschitz, S\'{e}rgio},
  title =	{{CoaKG: A Contextualized Knowledge Graph Approach for Exploratory Search and Decision Making}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:27},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.4},
  URN =		{urn:nbn:de:0030-drops-236685},
  doi =		{10.4230/TGDK.3.1.4},
  annote =	{Keywords: Knowledge Graphs, Context Search, Decision Support}
}
Survey
Uncertainty Management in the Construction of Knowledge Graphs: A Survey

Authors: Lucas Jarnac, Yoan Chabot, and Miguel Couceiro

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1 (2025)


Abstract
Knowledge Graphs (KGs) are a major asset for companies thanks to their great flexibility in data representation and their numerous applications, e.g., vocabulary sharing, question answering, or recommendation systems. To build a KG, it is common practice to rely on automatic methods for extracting knowledge from various heterogeneous sources. However, in a noisy and uncertain world, knowledge may not be reliable, and conflicts between data sources may occur. Integrating unreliable data would directly impact the use of the KG, so such conflicts must be resolved. This could be done manually by selecting the best data to integrate; this first approach is highly accurate, but costly and time-consuming. That is why recent efforts focus on automatic approaches, which is a challenging task, since it requires handling the uncertainty of extracted knowledge throughout its integration into the KG. We survey state-of-the-art approaches in this direction and present constructions of both open and enterprise KGs. We then describe different knowledge extraction methods and discuss downstream tasks after knowledge acquisition, including KG completion using embedding models, knowledge alignment, and knowledge fusion, in order to address the problem of knowledge uncertainty in KG construction. We conclude with a discussion of the remaining challenges and perspectives for constructing KGs under uncertainty.

Cite as

Lucas Jarnac, Yoan Chabot, and Miguel Couceiro. Uncertainty Management in the Construction of Knowledge Graphs: A Survey. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 3:1-3:48, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{jarnac_et_al:TGDK.3.1.3,
  author =	{Jarnac, Lucas and Chabot, Yoan and Couceiro, Miguel},
  title =	{{Uncertainty Management in the Construction of Knowledge Graphs: A Survey}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:48},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.3},
  URN =		{urn:nbn:de:0030-drops-233733},
  doi =		{10.4230/TGDK.3.1.3},
  annote =	{Keywords: Knowledge reconciliation, Uncertainty, Heterogeneous sources, Knowledge graph construction}
}
Resource
KG2Tables: A Domain-Specific Tabular Data Generator to Evaluate Semantic Table Interpretation Systems

Authors: Nora Abdelmageed, Ernesto Jiménez-Ruiz, Oktie Hassanzadeh, and Birgitta König-Ries

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1 (2025)


Abstract
Tabular data, often in the form of CSV files, plays a pivotal role in data analytics pipelines. Understanding this data semantically, known as Semantic Table Interpretation (STI), is crucial but poses challenges due to several factors, such as the ambiguity of labels. As a result, STI has gained increasing attention from the community in the past few years. Evaluating STI systems requires well-established benchmarks. Most existing large-scale benchmarks are derived from general-domain sources and focus on ambiguity, while domain-specific benchmarks are relatively small in size. This paper introduces KG2Tables, a framework that can construct domain-specific large-scale benchmarks from a Knowledge Graph (KG). KG2Tables leverages the internal hierarchy of the relevant KG concepts and their properties. As a proof of concept, we have built large datasets in the food, biodiversity, and biomedical domains. The resulting datasets, tFood, tBiomed, and tBiodiv, have been made publicly available in the ISWC SemTab challenge (2023 and 2024 editions). We include the evaluation results of top-performing STI systems using tFood. Such results underscore its potential as a robust evaluation benchmark for challenging STI systems. We demonstrate the quality of the generated benchmarks using a sample-based approach, including, for example, an assessment of how realistic the tables are. Finally, we provide an extensive discussion of KG2Tables, explaining how it could be used to create other benchmarks from any domain of interest and covering its key features and limitations, with suggestions to overcome them.

Cite as

Nora Abdelmageed, Ernesto Jiménez-Ruiz, Oktie Hassanzadeh, and Birgitta König-Ries. KG2Tables: A Domain-Specific Tabular Data Generator to Evaluate Semantic Table Interpretation Systems. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 1:1-1:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{abdelmageed_et_al:TGDK.3.1.1,
  author =	{Abdelmageed, Nora and Jim\'{e}nez-Ruiz, Ernesto and Hassanzadeh, Oktie and K\"{o}nig-Ries, Birgitta},
  title =	{{KG2Tables: A Domain-Specific Tabular Data Generator to Evaluate Semantic Table Interpretation Systems}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:28},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.1},
  URN =		{urn:nbn:de:0030-drops-230104},
  doi =		{10.4230/TGDK.3.1.1},
  annote =	{Keywords: Semantic Table Interpretation (STI), Knowledge Graph (KG), STI Benchmark, Food, Biodiversity, Biomedical}
}
Research
Talking Wikidata: Communication Patterns and Their Impact on Community Engagement in Collaborative Knowledge Graphs

Authors: Elisavet Koutsiana, Ioannis Reklos, Kholoud Saad Alghamdi, Nitisha Jain, Albert Meroño-Peñuela, and Elena Simperl

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1 (2025)


Abstract
We study the collaboration patterns of Wikidata, one of the world's largest open-source collaborative knowledge graph (KG) communities. Collaborative KG communities play a key role in structuring machine-readable knowledge to support AI systems such as conversational agents. However, these communities face challenges related to long-term member engagement, as a small subset of contributors is often responsible for the majority of contributions and decision-making. While prior research has explored contributors' roles and lifespans, discussions within collaborative KG communities remain understudied. To fill this gap, we investigate the behavioural patterns of contributors and the factors affecting their communication and participation. We analyse all the discussions on Wikidata using a mixed-methods approach, including statistical tests, network analysis, and text and graph embedding representations. Our findings reveal that the interactions between Wikidata editors form a small-world network, resilient to dropouts and inclusive, where both the network topology and the discussion content influence the continuity of conversations. Furthermore, the account age of Wikidata members and their conversations are significant factors in their long-term engagement with the project. Our observations and recommendations can benefit the Wikidata and Semantic Web communities, providing guidance on how to improve collaborative environments for sustainability, growth, and quality.

Cite as

Elisavet Koutsiana, Ioannis Reklos, Kholoud Saad Alghamdi, Nitisha Jain, Albert Meroño-Peñuela, and Elena Simperl. Talking Wikidata: Communication Patterns and Their Impact on Community Engagement in Collaborative Knowledge Graphs. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 2:1-2:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)



@Article{koutsiana_et_al:TGDK.3.1.2,
  author =	{Koutsiana, Elisavet and Reklos, Ioannis and Alghamdi, Kholoud Saad and Jain, Nitisha and Mero\~{n}o-Pe\~{n}uela, Albert and Simperl, Elena},
  title =	{{Talking Wikidata: Communication Patterns and Their Impact on Community Engagement in Collaborative Knowledge Graphs}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{2:1--2:27},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.2},
  URN =		{urn:nbn:de:0030-drops-230114},
  doi =		{10.4230/TGDK.3.1.2},
  annote =	{Keywords: collaborative knowledge graph, network analysis, graph embeddings, text embeddings}
}
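The small-world finding in the abstract above rests on two standard graph measures: high local clustering and short average path lengths between editors. A minimal pure-Python sketch of both measures, run on a tiny hypothetical reply graph (the edge list is illustrative, not Wikidata data):

```python
from collections import deque

# Hypothetical undirected "who replied to whom" graph among six editors.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("a", "e")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def clustering(node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

def avg_shortest_path():
    """Mean BFS distance over all connected ordered node pairs."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(d for n, d in dist.items() if n != src)
        pairs += len(dist) - 1
    return total / pairs

avg_clustering = sum(clustering(n) for n in adj) / len(adj)
print(avg_clustering, avg_shortest_path())
```

A small-world diagnosis compares these two numbers against a random graph of the same size and density: clustering should be much higher while path lengths stay comparably short.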
Unified Multimedia Segmentation - A Comprehensive Model for URI-based Media Segment Representation

Authors: Jan Willi, Abraham Bernstein, and Luca Rossetto

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 3 (2024)


Abstract
In multimedia annotation, it is often desirable to reference specific segments of a document because of its richness and multimodality, but no universal representation for such references exists. This significantly hampers the use of multimedia content in knowledge graphs, where it is modeled as one large, atomic information container. Unstructured data - such as text, audio, images, and video - can commonly be decomposed into its constituent parts, as such documents rarely contain only one semantic concept. Hence, it is reasonable to assume that these previously atomic documents can be decomposed into logical segments. To be processable by the knowledge graph stack, however, one needs to break the atomic nature of multimedia content by providing a mechanism to address media segments. This paper proposes a Unified Segmentation Model capable of depicting arbitrary segmentations on any media document type. The work begins with a formal analysis of multimedia and segmentation, exploring segmentation operations and how to describe them. Building on this analysis, it then develops a practical scheme for expressing segmentation in Uniform Resource Identifiers (URIs). By making segments of multimedia content referencable, this approach breaks their atomic nature and makes them first-class citizens within knowledge graphs. The proposed model is implemented as a proof of concept in the MediaGraph Store, a multimedia knowledge graph storage and querying engine.

Cite as

Jan Willi, Abraham Bernstein, and Luca Rossetto. Unified Multimedia Segmentation - A Comprehensive Model for URI-based Media Segment Representation. In Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 3, pp. 1:1-1:34, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@Article{willi_et_al:TGDK.2.3.1,
  author =	{Willi, Jan and Bernstein, Abraham and Rossetto, Luca},
  title =	{{Unified Multimedia Segmentation - A Comprehensive Model for URI-based Media Segment Representation}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:34},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.3.1},
  URN =		{urn:nbn:de:0030-drops-225953},
  doi =		{10.4230/TGDK.2.3.1},
  annote =	{Keywords: Multimodal Knowledge Graphs, Multimedia Segmentation, Multimedia Representation}
}
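The abstract above argues for URI-addressable media segments. As a hedged sketch of what such addressing can look like, the snippet below uses the fragment style of the W3C Media Fragments syntax (`#t=start,end` for a temporal segment) rather than the paper's own, more general scheme; the helper names are illustrative:

```python
from urllib.parse import urldefrag

def temporal_segment(doc_uri, start, end):
    """Reference the sub-interval [start, end] (in seconds) of a media document
    by appending a Media-Fragments-style temporal fragment to its URI."""
    return f"{doc_uri}#t={start},{end}"

def parse_segment(uri):
    """Split a segment URI back into the document URI and its fragment
    parameters, so the segment reference stays resolvable to the whole."""
    doc, frag = urldefrag(uri)
    params = dict(part.split("=", 1) for part in frag.split("&")) if frag else {}
    return doc, params
```

Because the segment URI embeds the document URI, a knowledge graph can attach annotations to `http://example.org/video.mp4#t=10,20` while still relating the segment to its parent document, which is exactly the first-class-citizen property the paper is after.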
Resource Paper
The Reasonable Ontology Templates Framework

Authors: Martin Georg Skjæveland and Leif Harald Karlsen

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 2 (2024): Special Issue on Resources for Graph Data and Knowledge


Abstract
Reasonable Ontology Templates (OTTR) is a templating language for representing and instantiating patterns. It is based on simple and generic, yet powerful, mechanisms such as recursive macro expansion, term substitution, and type systems, and is designed particularly for building and maintaining RDF knowledge graphs and OWL ontologies. In this resource paper, we present the formal specifications that define the OTTR framework. This includes the fundamentals of the OTTR language, the adaptations that make it fit with standard semantic web languages, and two serialization formats developed for semantic web practitioners. We also present the OTTR framework's support for documenting, publishing, and managing template libraries, and its tools for practical bulk instantiation of templates from tabular data and queryable data sources. The functionality of the OTTR framework is available for use through Lutra, an open-source reference implementation, and other independent implementations. We report on the use and impact of OTTR by presenting selected industrial use cases. Finally, we reflect on some design considerations of the language and framework and present ideas for future work.

Cite as

Martin Georg Skjæveland and Leif Harald Karlsen. The Reasonable Ontology Templates Framework. In Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 2, pp. 5:1-5:54, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@Article{skjaeveland_et_al:TGDK.2.2.5,
  author =	{Skj{\ae}veland, Martin Georg and Karlsen, Leif Harald},
  title =	{{The Reasonable Ontology Templates Framework}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{5:1--5:54},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.2.5},
  URN =		{urn:nbn:de:0030-drops-225896},
  doi =		{10.4230/TGDK.2.2.5},
  annote =	{Keywords: Ontology engineering, Ontology design patterns, Template mechanism, Macros}
}
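OTTR's core mechanism, recursive macro expansion with term substitution, can be sketched in a few lines. The template table and names below (`ex:Person` and its body) are illustrative and not the actual OTTR library or Lutra API; only the idea that expansion bottoms out in a base triple template (in OTTR, `ottr:Triple`) follows the language itself:

```python
# Base template: emits one RDF triple directly instead of expanding further.
BASE = "ottr:Triple"

# Hypothetical template library: name -> (parameter list, body of instances).
templates = {
    # ex:Person(?p, ?name) expands to a type triple and a name triple.
    "ex:Person": (
        ["?p", "?name"],
        [(BASE, ["?p", "rdf:type", "foaf:Person"]),
         (BASE, ["?p", "foaf:name", "?name"])],
    ),
}

def expand(name, args):
    """Recursively expand a template instance into ground RDF triples,
    substituting arguments for parameters at each level."""
    if name == BASE:
        return [tuple(args)]
    params, body = templates[name]
    subst = dict(zip(params, args))
    triples = []
    for child_name, child_args in body:
        triples.extend(expand(child_name,
                              [subst.get(a, a) for a in child_args]))
    return triples

print(expand("ex:Person", ["ex:alice", '"Alice"']))
```

Real OTTR adds type checking of arguments, optional and list parameters, and nested templates expanding into other templates, but the substitute-then-recurse loop above is the heart of the mechanism.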
Resource Paper
The dblp Knowledge Graph and SPARQL Endpoint

Authors: Marcel R. Ackermann, Hannah Bast, Benedikt Maria Beckermann, Johannes Kalmbach, Patrick Neises, and Stefan Ollinger

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 2 (2024): Special Issue on Resources for Graph Data and Knowledge


Abstract
For more than 30 years, the dblp computer science bibliography has provided quality-checked and curated bibliographic metadata on major computer science journals, proceedings, and monographs. Its semantic content has been published as RDF or similar graph data by third parties in the past, but most of these resources have now disappeared from the web or are no longer actively synchronized with the latest dblp data. In this article, we introduce the dblp Knowledge Graph (dblp KG), the first semantic representation of the dblp data that is designed and maintained by the dblp team. The dataset is augmented by citation data from the OpenCitations corpus. Open and FAIR access to the data is provided via daily updated RDF dumps, persistently archived monthly releases, a new public SPARQL endpoint with a powerful user interface, and a linked open data API. We also make it easy to self-host a replica of our SPARQL endpoint. We provide an introduction to working with the dblp KG and the added citation data using our SPARQL endpoint, with several example queries. Finally, we present the results of a small performance evaluation.

Cite as

Marcel R. Ackermann, Hannah Bast, Benedikt Maria Beckermann, Johannes Kalmbach, Patrick Neises, and Stefan Ollinger. The dblp Knowledge Graph and SPARQL Endpoint. In Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 2, pp. 3:1-3:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)



@Article{ackermann_et_al:TGDK.2.2.3,
  author =	{Ackermann, Marcel R. and Bast, Hannah and Beckermann, Benedikt Maria and Kalmbach, Johannes and Neises, Patrick and Ollinger, Stefan},
  title =	{{The dblp Knowledge Graph and SPARQL Endpoint}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:23},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.2.3},
  URN =		{urn:nbn:de:0030-drops-225870},
  doi =		{10.4230/TGDK.2.2.3},
  annote =	{Keywords: dblp, Scholarly Knowledge Graph, Resource, RDF, SPARQL}
}
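The article above describes a public SPARQL endpoint for the dblp KG. As a hedged sketch of issuing a query against it: the endpoint URL, the schema terms (`dblp:title`, `dblp:authoredBy`, `dblp:yearOfPublication`), and the placeholder author IRI below are assumptions to verify against the live dblp documentation before use.

```python
from urllib.parse import urlencode

# Assumed public endpoint, as described in the article.
ENDPOINT = "https://sparql.dblp.org/sparql"

# List recent publication titles by one author. The author IRI is a
# placeholder (dblp assigns each person a persistent PID-based IRI).
QUERY = """
PREFIX dblp: <https://dblp.org/rdf/schema#>
SELECT ?title ?year WHERE {
  ?pub dblp:authoredBy <https://dblp.org/pid/00/0000> ;
       dblp:title ?title ;
       dblp:yearOfPublication ?year .
}
ORDER BY DESC(?year)
LIMIT 5
"""

def request_url(endpoint, query):
    """Build the GET request URL for a SPARQL query, asking for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})

print(request_url(ENDPOINT, QUERY))
```

The resulting URL can be fetched with any HTTP client; self-hosted replicas of the endpoint, which the article also supports, would only change `ENDPOINT`.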
