15 Search Results for "Lopez, Raphael"


Document
Toward an Earth-Independent System for EVA Mission Planning: Integrating Physical Models, Domain Knowledge, and Agentic RAG to Provide Explainable LLM-Based Decision Support

Authors: Kaisheng Li and Richard S. Whittle

Published in: OASIcs, Volume 130, Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)


Abstract
We propose a unified framework for an Earth‑independent AI system that provides explainable, context‑aware decision support for EVA mission planning by integrating six core components: a fine‑tuned EVA domain LLM, a retrieval‑augmented knowledge base, a short-term memory store, physical simulation models, an agentic orchestration layer, and a multimodal user interface. To ground our design, we analyze the current roles and substitution potential of the Mission Control Center - identifying which procedural and analytical functions can be automated onboard while preserving human oversight for experiential and strategic tasks. Building on this framework, we introduce RASAGE (Retrieval & Simulation Augmented Guidance Agent for Exploration), a proof‑of‑concept toolset that combines Microsoft Phi‑4‑mini‑instruct with a FAISS (Facebook AI Similarity Search)‑powered EVA knowledge base and custom A* path planning and hypogravity metabolic models to generate grounded, traceable EVA plans. We outline a staged validation strategy to evaluate improvements in route efficiency, metabolic prediction accuracy, anomaly response effectiveness, and crew trust under realistic communication delays. Our findings demonstrate the feasibility of replicating key Mission Control functions onboard, enhancing crew autonomy, reducing cognitive load, and improving safety for deep‑space exploration missions.

Cite as

Kaisheng Li and Richard S. Whittle. Toward an Earth-Independent System for EVA Mission Planning: Integrating Physical Models, Domain Knowledge, and Agentic RAG to Provide Explainable LLM-Based Decision Support. In Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025). Open Access Series in Informatics (OASIcs), Volume 130, pp. 6:1-6:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{li_et_al:OASIcs.SpaceCHI.2025.6,
  author =	{Li, Kaisheng and Whittle, Richard S.},
  title =	{{Toward an Earth-Independent System for EVA Mission Planning: Integrating Physical Models, Domain Knowledge, and Agentic RAG to Provide Explainable LLM-Based Decision Support}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{6:1--6:17},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.6},
  URN =		{urn:nbn:de:0030-drops-239967},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.6},
  annote =	{Keywords: Human-AI Interaction for Space Exploration, Extravehicular Activities, Cognitive load and Human Performance Issues, Human Systems Exploration, Lunar Exploration, LLM}
}
Document
Extended Abstract
Partitioned Multi-MUM Finding for Scalable Pangenomics (Extended Abstract)

Authors: Vikram S. Shivakumar and Ben Langmead

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)


Abstract
Pangenome collections continue to grow and proliferate to hundreds of high-quality genomes, for example, the expanded v2 version of the Human Pangenome Reference Consortium (HPRC) dataset spanning 474 human haplotypes [Liao et al., 2023]. As the size and complexity of these collections grow, it is increasingly important that our methods for studying and indexing pangenomes be scalable and updateable. Maximal Unique Matches (multi-MUMs), exact substring matches present exactly once in all sequences in a pangenome collection, represent conserved anchor sequences that can comprise a common coordinate system. We previously proposed a framework and tool called Mumemto for rapidly identifying multi-MUMs during construction of a compressed pangenome index [Shivakumar and Langmead, 2025]. Using prefix-free parsing (PFP) [Boucher et al., 2019], a compressed-space method for computing full-text indexes, Mumemto outperforms existing methods for identifying multi-MUMs. However, one drawback remains updateability and scalability. Mumemto can become memory-intensive for large pangenomes (> 300 human genomes), and as newly assembled genomes are added to a pangenome collection, Mumemto requires re-running on the entire updated collection. To address this, we developed a partition-merging approach to compute multi-MUMs with Mumemto. We introduce two strategies for merging of multi-MUMs computed across different collections (see Figure 1), enabling parallelization across partitions and simple computation of multi-MUMs for incrementally-updated collections. The first strategy requires a common sequence in each partition (which we call "anchor-based merging"), which serves as a coordinate system to identify multi-MUM overlaps between partitions. By tracking the next longest match for all multi-MUMs and unique matches (UMs) in an auxiliary data structure, intersections between matches can be filtered out if no longer unique in the union collection. The second strategy identifies overlaps directly from the multi-MUM substrings (called "string-based merging"). The overlaps are identified by running Mumemto over the extracted multi-MUM sequences and are similarly filtered out if they are too short to considered unique. Lastly, we propose an extension to anchor-based merging to enable the computation of partial multi-MUMs, present in only a subset of sequences in the union set. The partition-merging framework introduces a tradeoff space in Mumemto between running time and memory, depending on partition size and the number of threads. Running parallel, per-partition Mumemto processes and merging the results reduces the running time but increases the peak memory footprint, while running a single Mumemto thread over each partition serially yields longer running time but a smaller memory footprint. To evaluate this tradeoff, we computed multi-MUMs across 474 haplotypes of chr19 from the HPRC v2 dataset [Liao et al., 2023] and 69 assemblies of A. thaliana [Lian et al., 2024] (Table 1). The string-based method also enables merging multi-MUMs between disjoint collections, for example subclades in a phylogenetic tree. By merging multi-MUMs along the shape of the tree, we can compute matches at internal nodes of the tree along with the root, revealing clade-specific conservation and structural variation. Multi-MUM merging also enables interspecific match computation, which was previously infeasible with Mumemto due to high memory usage for highly-diverse input sequence collections. We use partition-merging to compute multi-MUMs across 29 primate assemblies, and found a correspondence to ultraconserved elements previously found across mammalian genomes [Cummins et al., 2024]. We show that a partitioned Mumemto enables scalability to growing pangenome collections and expands the applicability of Mumemto to larger, more diverse datasets. As a result, Mumemto is the only method capable of computing exact matches across the entire HPRC v2 dataset (474 haplotypes [Liao et al., 2023]), and can easily incorporate future releases of assemblies without recomputation. This increases the scope for exploration of genomic conservation and variation and highlights the potential for Mumemto as a core method for future pangenomics and comparative genomics research. The partitioned Mumemto framework is implemented in v1.3.0 and is available open-source at https://github.com/vikshiv/mumemto.

Cite as

Vikram S. Shivakumar and Ben Langmead. Partitioned Multi-MUM Finding for Scalable Pangenomics (Extended Abstract). In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 23:1-23:4, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{shivakumar_et_al:LIPIcs.WABI.2025.23,
  author =	{Shivakumar, Vikram S. and Langmead, Ben},
  title =	{{Partitioned Multi-MUM Finding for Scalable Pangenomics}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{23:1--23:4},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.23},
  URN =		{urn:nbn:de:0030-drops-239490},
  doi =		{10.4230/LIPIcs.WABI.2025.23},
  annote =	{Keywords: Pangenomics, Comparative genomics, Compressed indexing}
}
Document
DiVerG: Scalable Distance Index for Validation of Paired-End Alignments in Sequence Graphs

Authors: Ali Ghaffaari, Alexander Schönhuth, and Tobias Marschall

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)


Abstract
Determining the distance between two loci within a genomic region is a recurrent operation in various tasks in computational genomics. A notable example of this task arises in paired-end read mapping as a form of validation of distances between multiple alignments. While straightforward for a single genome, graph-based reference structures render the operation considerably more involved. Given the sheer number of such queries in a typical read mapping experiment, an efficient algorithm for answering distance queries is crucial. In this paper, we introduce DiVerG, a compact data structure as well as a fast and scalable algorithm, for constructing distance indexes for general sequence graphs on multi-core CPU and many-core GPU architectures. DiVerG is based on PairG [Jain et al., 2019], but overcomes the limitations of PairG by exploiting the extensive potential for improvements in terms of scalability and space efficiency. As a consequence, DiVerG can process substantially larger datasets, such as whole human genomes, which are unmanageable by PairG. DiVerG offers faster index construction time and consistently faster query time with gains proportional to the size of the underlying compact data structure. We demonstrate that our method performs favorably on multiple real datasets at various scales. DiVerG achieves superior performance over PairG; e.g. resulting to 2.5-4x speed-up in query time, 44-340x smaller index size, and 3-50x faster construction time for the genome graph of the MHC region, as a particularly variable region of the human genome. The implementation is available at: https://github.com/cartoonist/diverg

Cite as

Ali Ghaffaari, Alexander Schönhuth, and Tobias Marschall. DiVerG: Scalable Distance Index for Validation of Paired-End Alignments in Sequence Graphs. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 10:1-10:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{ghaffaari_et_al:LIPIcs.WABI.2025.10,
  author =	{Ghaffaari, Ali and Sch\"{o}nhuth, Alexander and Marschall, Tobias},
  title =	{{DiVerG: Scalable Distance Index for Validation of Paired-End Alignments in Sequence Graphs}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{10:1--10:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.10},
  URN =		{urn:nbn:de:0030-drops-239369},
  doi =		{10.4230/LIPIcs.WABI.2025.10},
  annote =	{Keywords: Sequence graph, distance index, read mapping, sparse matrix}
}
Document
Invited Talk
Vehicle: Bridging the Embedding Gap in the Verification of Neuro-Symbolic Programs (Invited Talk)

Authors: Matthew L. Daggitt, Wen Kokke, Robert Atkey, Ekaterina Komendantskaya, Natalia Slusarz, and Luca Arnaboldi

Published in: LIPIcs, Volume 337, 10th International Conference on Formal Structures for Computation and Deduction (FSCD 2025)


Abstract
Neuro-symbolic programs, i.e. programs containing both machine learning components and traditional symbolic code, are becoming increasingly widespread. Finding a general methodology for verifying such programs is challenging due to both the number of different tools involved and the intricate interface between the "neural" and "symbolic" program components. In this paper we present a general decomposition of the neuro-symbolic verification problem into parts, and examine the problem of the embedding gap that occurs when one tries to combine proofs about the neural and symbolic components. To address this problem we then introduce Vehicle - standing as an abbreviation for a "verification condition language" - an intermediate programming language interface between machine learning frameworks, automated theorem provers, and dependently-typed formalisations of neuro-symbolic programs. Vehicle allows users to specify the properties of the neural components of neuro-symbolic programs once, and then safely compile the specification to each interface using a tailored typing and compilation procedure. We give a high-level overview of Vehicle’s overall design, its interfaces and compilation & type-checking procedures, and then demonstrate its utility by formally verifying the safety of a simple autonomous car controlled by a neural network, operating in a stochastic environment with imperfect information.

Cite as

Matthew L. Daggitt, Wen Kokke, Robert Atkey, Ekaterina Komendantskaya, Natalia Slusarz, and Luca Arnaboldi. Vehicle: Bridging the Embedding Gap in the Verification of Neuro-Symbolic Programs (Invited Talk). In 10th International Conference on Formal Structures for Computation and Deduction (FSCD 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 337, pp. 2:1-2:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{daggitt_et_al:LIPIcs.FSCD.2025.2,
  author =	{Daggitt, Matthew L. and Kokke, Wen and Atkey, Robert and Komendantskaya, Ekaterina and Slusarz, Natalia and Arnaboldi, Luca},
  title =	{{Vehicle: Bridging the Embedding Gap in the Verification of Neuro-Symbolic Programs}},
  booktitle =	{10th International Conference on Formal Structures for Computation and Deduction (FSCD 2025)},
  pages =	{2:1--2:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-374-4},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{337},
  editor =	{Fern\'{a}ndez, Maribel},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSCD.2025.2},
  URN =		{urn:nbn:de:0030-drops-236172},
  doi =		{10.4230/LIPIcs.FSCD.2025.2},
  annote =	{Keywords: Neural Network Verification, Types, Interactive Theorem Provers}
}
Document
Survey
Uncertainty Management in the Construction of Knowledge Graphs: A Survey

Authors: Lucas Jarnac, Yoan Chabot, and Miguel Couceiro

Published in: TGDK, Volume 3, Issue 1 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 1


Abstract
Knowledge Graphs (KGs) are a major asset for companies thanks to their great flexibility in data representation and their numerous applications, e.g., vocabulary sharing, Q&A or recommendation systems. To build a KG, it is a common practice to rely on automatic methods for extracting knowledge from various heterogeneous sources. However, in a noisy and uncertain world, knowledge may not be reliable and conflicts between data sources may occur. Integrating unreliable data would directly impact the use of the KG, therefore such conflicts must be resolved. This could be done manually by selecting the best data to integrate. This first approach is highly accurate, but costly and time-consuming. That is why recent efforts focus on automatic approaches, which represent a challenging task since it requires handling the uncertainty of extracted knowledge throughout its integration into the KG. We survey state-of-the-art approaches in this direction and present constructions of both open and enterprise KGs. We then describe different knowledge extraction methods and discuss downstream tasks after knowledge acquisition, including KG completion using embedding models, knowledge alignment, and knowledge fusion in order to address the problem of knowledge uncertainty in KG construction. We conclude with a discussion on the remaining challenges and perspectives when constructing a KG taking into account uncertainty.

Cite as

Lucas Jarnac, Yoan Chabot, and Miguel Couceiro. Uncertainty Management in the Construction of Knowledge Graphs: A Survey. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 3:1-3:48, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@Article{jarnac_et_al:TGDK.3.1.3,
  author =	{Jarnac, Lucas and Chabot, Yoan and Couceiro, Miguel},
  title =	{{Uncertainty Management in the Construction of Knowledge Graphs: A Survey}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:48},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.3},
  URN =		{urn:nbn:de:0030-drops-233733},
  doi =		{10.4230/TGDK.3.1.3},
  annote =	{Keywords: Knowledge reconciliation, Uncertainty, Heterogeneous sources, Knowledge graph construction}
}
Document
Resource
KG2Tables: A Domain-Specific Tabular Data Generator to Evaluate Semantic Table Interpretation Systems

Authors: Nora Abdelmageed, Ernesto Jiménez-Ruiz, Oktie Hassanzadeh, and Birgitta König-Ries

Published in: TGDK, Volume 3, Issue 1 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 1


Abstract
Tabular data, often in the form of CSV files, plays a pivotal role in data analytics pipelines. Understanding this data semantically, known as Semantic Table Interpretation (STI), is crucial but poses challenges due to several factors such as the ambiguity of labels. As a result, STI has gained increasing attention from the community in the past few years. Evaluating STI systems requires well-established benchmarks. Most of the existing large-scale benchmarks are derived from general domain sources and focus on ambiguity, while domain-specific benchmarks are relatively small in size. This paper introduces KG2Tables, a framework that can construct domain-specific large-scale benchmarks from a Knowledge Graph (KG). KG2Tables leverages the internal hierarchy of the relevant KG concepts and their properties. As a proof of concept, we have built large datasets in the food, biodiversity, and biomedical domains. The resulting datasets, tFood, tBiomed, and tBiodiv, have been made available for the public in the ISWC SemTab challenge (2023 and 2024 editions). We include the evaluation results of top-performing STI systems using tFood Such results underscore its potential as a robust evaluation benchmark for challenging STI systems. We demonstrate the data quality level using a sample-based approach for the generated benchmarks including, for example, realistic tables assessment. Nevertheless, we provide an extensive discussion of KG2Tables explaining how it could be used to create other benchmarks from any domain of interest and including its key features and limitations with suggestions to overcome them.

Cite as

Nora Abdelmageed, Ernesto Jiménez-Ruiz, Oktie Hassanzadeh, and Birgitta König-Ries. KG2Tables: A Domain-Specific Tabular Data Generator to Evaluate Semantic Table Interpretation Systems. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 1:1-1:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@Article{abdelmageed_et_al:TGDK.3.1.1,
  author =	{Abdelmageed, Nora and Jim\'{e}nez-Ruiz, Ernesto and Hassanzadeh, Oktie and K\"{o}nig-Ries, Birgitta},
  title =	{{KG2Tables: A Domain-Specific Tabular Data Generator to Evaluate Semantic Table Interpretation Systems}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:28},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.1},
  URN =		{urn:nbn:de:0030-drops-230104},
  doi =		{10.4230/TGDK.3.1.1},
  annote =	{Keywords: Semantic Table Interpretation (STI), Knowledge Graph (KG), STI Benchmark, Food, Biodiversity, Biomedical}
}
Document
Position
Grounding Stream Reasoning Research

Authors: Pieter Bonte, Jean-Paul Calbimonte, Daniel de Leng, Daniele Dell'Aglio, Emanuele Della Valle, Thomas Eiter, Federico Giannini, Fredrik Heintz, Konstantin Schekotihin, Danh Le-Phuoc, Alessandra Mileo, Patrik Schneider, Riccardo Tommasini, Jacopo Urbani, and Giacomo Ziffer

Published in: TGDK, Volume 2, Issue 1 (2024): Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge, Volume 2, Issue 1


Abstract
In the last decade, there has been a growing interest in applying AI technologies to implement complex data analytics over data streams. To this end, researchers in various fields have been organising a yearly event called the "Stream Reasoning Workshop" to share perspectives, challenges, and experiences around this topic. In this paper, the previous organisers of the workshops and other community members provide a summary of the main research results that have been discussed during the first six editions of the event. These results can be categorised into four main research areas: The first is concerned with the technological challenges related to handling large data streams. The second area aims at adapting and extending existing semantic technologies to data streams. The third and fourth areas focus on how to implement reasoning techniques, either considering deductive or inductive techniques, to extract new and valuable knowledge from the data in the stream. This summary is written not only to provide a crystallisation of the field, but also to point out distinctive traits of the stream reasoning community. Moreover, it also provides a foundation for future research by enumerating a list of use cases and open challenges, to stimulate others to join this exciting research area.

Cite as

Pieter Bonte, Jean-Paul Calbimonte, Daniel de Leng, Daniele Dell'Aglio, Emanuele Della Valle, Thomas Eiter, Federico Giannini, Fredrik Heintz, Konstantin Schekotihin, Danh Le-Phuoc, Alessandra Mileo, Patrik Schneider, Riccardo Tommasini, Jacopo Urbani, and Giacomo Ziffer. Grounding Stream Reasoning Research. In Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 1, pp. 2:1-2:47, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@Article{bonte_et_al:TGDK.2.1.2,
  author =	{Bonte, Pieter and Calbimonte, Jean-Paul and de Leng, Daniel and Dell'Aglio, Daniele and Della Valle, Emanuele and Eiter, Thomas and Giannini, Federico and Heintz, Fredrik and Schekotihin, Konstantin and Le-Phuoc, Danh and Mileo, Alessandra and Schneider, Patrik and Tommasini, Riccardo and Urbani, Jacopo and Ziffer, Giacomo},
  title =	{{Grounding Stream Reasoning Research}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{2:1--2:47},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.1.2},
  URN =		{urn:nbn:de:0030-drops-198597},
  doi =		{10.4230/TGDK.2.1.2},
  annote =	{Keywords: Stream Reasoning, Stream Processing, RDF streams, Streaming Linked Data, Continuous query processing, Temporal Logics, High-performance computing, Databases}
}
Document
Survey
Logics for Conceptual Data Modelling: A Review

Authors: Pablo R. Fillottrani and C. Maria Keet

Published in: TGDK, Volume 2, Issue 1 (2024): Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge, Volume 2, Issue 1


Abstract
Information modelling for databases and object-oriented information systems avails of conceptual data modelling languages such as EER and UML Class Diagrams. Many attempts exist to add logical rigour to them, for various reasons and with disparate strengths. In this paper we aim to provide a structured overview of the many efforts. We focus on aims, approaches to the formalisation, including key dimensions of choice points, popular logics used, and the main relevant reasoning services. We close with current challenges and research directions.

Cite as

Pablo R. Fillottrani and C. Maria Keet. Logics for Conceptual Data Modelling: A Review. In Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 1, pp. 4:1-4:30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@Article{fillottrani_et_al:TGDK.2.1.4,
  author =	{Fillottrani, Pablo R. and Keet, C. Maria},
  title =	{{Logics for Conceptual Data Modelling: A Review}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:30},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.1.4},
  URN =		{urn:nbn:de:0030-drops-198616},
  doi =		{10.4230/TGDK.2.1.4},
  annote =	{Keywords: Conceptual Data Modelling, EER, UML, Description Logics, OWL}
}
Document
Position
Knowledge Graphs for the Life Sciences: Recent Developments, Challenges and Opportunities

Authors: Jiaoyan Chen, Hang Dong, Janna Hastings, Ernesto Jiménez-Ruiz, Vanessa López, Pierre Monnin, Catia Pesquita, Petr Škoda, and Valentina Tamma

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1


Abstract
The term life sciences refers to the disciplines that study living organisms and life processes, and include chemistry, biology, medicine, and a range of other related disciplines. Research efforts in life sciences are heavily data-driven, as they produce and consume vast amounts of scientific data, much of which is intrinsically relational and graph-structured. The volume of data and the complexity of scientific concepts and relations referred to therein promote the application of advanced knowledge-driven technologies for managing and interpreting data, with the ultimate aim to advance scientific discovery. In this survey and position paper, we discuss recent developments and advances in the use of graph-based technologies in life sciences and set out a vision for how these technologies will impact these fields into the future. We focus on three broad topics: the construction and management of Knowledge Graphs (KGs), the use of KGs and associated technologies in the discovery of new knowledge, and the use of KGs in artificial intelligence applications to support explanations (explainable AI). We select a few exemplary use cases for each topic, discuss the challenges and open research questions within these topics, and conclude with a perspective and outlook that summarizes the overarching challenges and their potential solutions as a guide for future research.

Cite as

Jiaoyan Chen, Hang Dong, Janna Hastings, Ernesto Jiménez-Ruiz, Vanessa López, Pierre Monnin, Catia Pesquita, Petr Škoda, and Valentina Tamma. Knowledge Graphs for the Life Sciences: Recent Developments, Challenges and Opportunities. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 5:1-5:33, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Article{chen_et_al:TGDK.1.1.5,
  author =	{Chen, Jiaoyan and Dong, Hang and Hastings, Janna and Jim\'{e}nez-Ruiz, Ernesto and L\'{o}pez, Vanessa and Monnin, Pierre and Pesquita, Catia and \v{S}koda, Petr and Tamma, Valentina},
  title =	{{Knowledge Graphs for the Life Sciences: Recent Developments, Challenges and Opportunities}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{5:1--5:33},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.5},
  URN =		{urn:nbn:de:0030-drops-194791},
  doi =		{10.4230/TGDK.1.1.5},
  annote =	{Keywords: Knowledge graphs, Life science, Knowledge discovery, Explainable AI}
}
Document
Position
Large Language Models and Knowledge Graphs: Opportunities and Challenges

Authors: Jeff Z. Pan, Simon Razniewski, Jan-Christoph Kalo, Sneha Singhania, Jiaoyan Chen, Stefan Dietze, Hajira Jabeen, Janna Omeliyanenko, Wen Zhang, Matteo Lissandrini, Russa Biswas, Gerard de Melo, Angela Bonifati, Edlira Vakaj, Mauro Dragoni, and Damien Graux

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1


Abstract
Large Language Models (LLMs) have taken Knowledge Representation - and the world - by storm. This inflection point marks a shift from explicit knowledge representation to a renewed focus on the hybrid representation of both explicit knowledge and parametric knowledge. In this position paper, we will discuss some of the common debate points within the community on LLMs (parametric knowledge) and Knowledge Graphs (explicit knowledge) and speculate on opportunities and visions that the renewed focus brings, as well as related research topics and challenges.

Cite as

Jeff Z. Pan, Simon Razniewski, Jan-Christoph Kalo, Sneha Singhania, Jiaoyan Chen, Stefan Dietze, Hajira Jabeen, Janna Omeliyanenko, Wen Zhang, Matteo Lissandrini, Russa Biswas, Gerard de Melo, Angela Bonifati, Edlira Vakaj, Mauro Dragoni, and Damien Graux. Large Language Models and Knowledge Graphs: Opportunities and Challenges. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 2:1-2:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Article{pan_et_al:TGDK.1.1.2,
  author =	{Pan, Jeff Z. and Razniewski, Simon and Kalo, Jan-Christoph and Singhania, Sneha and Chen, Jiaoyan and Dietze, Stefan and Jabeen, Hajira and Omeliyanenko, Janna and Zhang, Wen and Lissandrini, Matteo and Biswas, Russa and de Melo, Gerard and Bonifati, Angela and Vakaj, Edlira and Dragoni, Mauro and Graux, Damien},
  title =	{{Large Language Models and Knowledge Graphs: Opportunities and Challenges}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{2:1--2:38},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.2},
  URN =		{urn:nbn:de:0030-drops-194766},
  doi =		{10.4230/TGDK.1.1.2},
  annote =	{Keywords: Large Language Models, Pre-trained Language Models, Knowledge Graphs, Ontology, Retrieval Augmented Language Models}
}
Document
Survey
How Does Knowledge Evolve in Open Knowledge Graphs?

Authors: Axel Polleres, Romana Pernisch, Angela Bonifati, Daniele Dell'Aglio, Daniil Dobriy, Stefania Dumbrava, Lorena Etcheverry, Nicolas Ferranti, Katja Hose, Ernesto Jiménez-Ruiz, Matteo Lissandrini, Ansgar Scherp, Riccardo Tommasini, and Johannes Wachs

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1


Abstract
Openly available, collaboratively edited Knowledge Graphs (KGs) are key platforms for the collective management of evolving knowledge. The present work aims t o provide an analysis of the obstacles related to investigating and processing specifically this central aspect of evolution in KGs. To this end, we discuss (i) the dimensions of evolution in KGs, (ii) the observability of evolution in existing, open, collaboratively constructed Knowledge Graphs over time, and (iii) possible metrics to analyse this evolution. We provide an overview of relevant state-of-the-art research, ranging from metrics developed for Knowledge Graphs specifically to potential methods from related fields such as network science. Additionally, we discuss technical approaches - and their current limitations - related to storing, analysing and processing large and evolving KGs in terms of handling typical KG downstream tasks.

Cite as

Axel Polleres, Romana Pernisch, Angela Bonifati, Daniele Dell'Aglio, Daniil Dobriy, Stefania Dumbrava, Lorena Etcheverry, Nicolas Ferranti, Katja Hose, Ernesto Jiménez-Ruiz, Matteo Lissandrini, Ansgar Scherp, Riccardo Tommasini, and Johannes Wachs. How Does Knowledge Evolve in Open Knowledge Graphs?. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 11:1-11:59, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Article{polleres_et_al:TGDK.1.1.11,
  author =	{Polleres, Axel and Pernisch, Romana and Bonifati, Angela and Dell'Aglio, Daniele and Dobriy, Daniil and Dumbrava, Stefania and Etcheverry, Lorena and Ferranti, Nicolas and Hose, Katja and Jim\'{e}nez-Ruiz, Ernesto and Lissandrini, Matteo and Scherp, Ansgar and Tommasini, Riccardo and Wachs, Johannes},
  title =	{{How Does Knowledge Evolve in Open Knowledge Graphs?}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{11:1--11:59},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.11},
  URN =		{urn:nbn:de:0030-drops-194855},
  doi =		{10.4230/TGDK.1.1.11},
  annote =	{Keywords: KG evolution, temporal KG, versioned KG, dynamic KG}
}
Document
Vision
Autonomy in the Age of Knowledge Graphs: Vision and Challenges

Authors: Jean-Paul Calbimonte, Andrei Ciortea, Timotheus Kampik, Simon Mayer, Terry R. Payne, Valentina Tamma, and Antoine Zimmermann

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1


Abstract
In this position paper, we propose that Knowledge Graphs (KGs) are one of the prime approaches to support the programming of autonomous software systems at the knowledge level. From this viewpoint, we survey how KGs can support different dimensions of autonomy in such systems: For example, the autonomy of systems with respect to their environment, or with respect to organisations; and we discuss related practical and research challenges. We emphasise that KGs need to be able to support systems of autonomous software agents that are themselves highly heterogeneous, which limits how these systems may use KGs. Furthermore, these heterogeneous software agents may populate highly dynamic environments, which implies that they require adaptive KGs. The scale of the envisioned systems - possibly stretching to the size of the Internet - highlights the maintainability of the underlying KGs that need to contain large-scale knowledge, which requires that KGs are maintained jointly by humans and machines. Furthermore, autonomous agents require procedural knowledge, and KGs should hence be explored more towards the provisioning of such knowledge to augment autonomous behaviour. Finally, we highlight the importance of modelling choices, including with respect to the selected abstraction level when modelling and with respect to the provisioning of more expressive constraint languages.

Cite as

Jean-Paul Calbimonte, Andrei Ciortea, Timotheus Kampik, Simon Mayer, Terry R. Payne, Valentina Tamma, and Antoine Zimmermann. Autonomy in the Age of Knowledge Graphs: Vision and Challenges. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 13:1-13:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Article{calbimonte_et_al:TGDK.1.1.13,
  author =	{Calbimonte, Jean-Paul and Ciortea, Andrei and Kampik, Timotheus and Mayer, Simon and Payne, Terry R. and Tamma, Valentina and Zimmermann, Antoine},
  title =	{{Autonomy in the Age of Knowledge Graphs: Vision and Challenges}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{13:1--13:22},
  ISSN =	{2942-7517},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.13},
  URN =		{urn:nbn:de:0030-drops-194872},
  doi =		{10.4230/TGDK.1.1.13},
  annote =	{Keywords: Knowledge graphs, Autonomous Systems}
}
Document
Survey
Knowledge Graph Embeddings: Open Challenges and Opportunities

Authors: Russa Biswas, Lucie-Aimée Kaffee, Michael Cochez, Stefania Dumbrava, Theis E. Jendal, Matteo Lissandrini, Vanessa Lopez, Eneldo Loza Mencía, Heiko Paulheim, Harald Sack, Edlira Kalemi Vakaj, and Gerard de Melo

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1


Abstract
While Knowledge Graphs (KGs) have long been used as valuable sources of structured knowledge, in recent years, KG embeddings have become a popular way of deriving numeric vector representations from them, for instance, to support knowledge graph completion and similarity search. This study surveys advances as well as open challenges and opportunities in this area. For instance, the most prominent embedding models focus primarily on structural information. However, there has been notable progress in incorporating further aspects, such as semantics, multi-modal, temporal, and multilingual features. Most embedding techniques are assessed using human-curated benchmark datasets for the task of link prediction, neglecting other important real-world KG applications. Many approaches assume a static knowledge graph and are unable to account for dynamic changes. Additionally, KG embeddings may encode data biases and lack interpretability. Overall, this study provides an overview of promising research avenues to learn improved KG embeddings that can address a more diverse range of use cases.

Cite as

Russa Biswas, Lucie-Aimée Kaffee, Michael Cochez, Stefania Dumbrava, Theis E. Jendal, Matteo Lissandrini, Vanessa Lopez, Eneldo Loza Mencía, Heiko Paulheim, Harald Sack, Edlira Kalemi Vakaj, and Gerard de Melo. Knowledge Graph Embeddings: Open Challenges and Opportunities. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 4:1-4:32, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Article{biswas_et_al:TGDK.1.1.4,
  author =	{Biswas, Russa and Kaffee, Lucie-Aim\'{e}e and Cochez, Michael and Dumbrava, Stefania and Jendal, Theis E. and Lissandrini, Matteo and Lopez, Vanessa and Menc{\'\i}a, Eneldo Loza and Paulheim, Heiko and Sack, Harald and Vakaj, Edlira Kalemi and de Melo, Gerard},
  title =	{{Knowledge Graph Embeddings: Open Challenges and Opportunities}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:32},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.4},
  URN =		{urn:nbn:de:0030-drops-194783},
  doi =		{10.4230/TGDK.1.1.4},
  annote =	{Keywords: Knowledge Graphs, KG embeddings, Link prediction, KG applications}
}
Document
Vision
Knowledge Engineering Using Large Language Models

Authors: Bradley P. Allen, Lise Stork, and Paul Groth

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1


Abstract
Knowledge engineering is a discipline that focuses on the creation and maintenance of processes that generate and apply knowledge. Traditionally, knowledge engineering approaches have focused on knowledge expressed in formal languages. The emergence of large language models and their capabilities to effectively work with natural language, in its broadest sense, raises questions about the foundations and practice of knowledge engineering. Here, we outline the potential role of LLMs in knowledge engineering, identifying two central directions: 1) creating hybrid neuro-symbolic knowledge systems; and 2) enabling knowledge engineering in natural language. Additionally, we formulate key open research questions to tackle these directions.

Cite as

Bradley P. Allen, Lise Stork, and Paul Groth. Knowledge Engineering Using Large Language Models. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 3:1-3:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@Article{allen_et_al:TGDK.1.1.3,
  author =	{Allen, Bradley P. and Stork, Lise and Groth, Paul},
  title =	{{Knowledge Engineering Using Large Language Models}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:19},
  ISSN =	{2942-7517},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.3},
  URN =		{urn:nbn:de:0030-drops-194777},
  doi =		{10.4230/TGDK.1.1.3},
  annote =	{Keywords: knowledge engineering, large language models}
}
Document
Toward Linearizability Testing for Multi-Word Persistent Synchronization Primitives

Authors: Diego Cepeda, Sakib Chowdhury, Nan Li, Raphael Lopez, Xinzhe Wang, and Wojciech Golab

Published in: LIPIcs, Volume 153, 23rd International Conference on Principles of Distributed Systems (OPODIS 2019)


Abstract
Persistent memory makes it possible to recover in-memory data structures following a failure instead of rebuilding them from state saved in slow secondary storage. Implementing such recoverable data structures correctly is challenging as their underlying algorithms must deal with both parallelism and failures, which makes them especially susceptible to programming errors. Traditional proofs of correctness should therefore be combined with other methods, such as model checking or software testing, to minimize the likelihood of uncaught defects. This research focuses specifically on the algorithmic principles of software testing, particularly linearizability analysis, for multi-word persistent synchronization primitives such as conditional swap operations. We describe an efficient decision procedure for linearizability in this context, and discuss its practical applications in detecting previously-unknown bugs in implementations of multi-word persistent primitives.

Cite as

Diego Cepeda, Sakib Chowdhury, Nan Li, Raphael Lopez, Xinzhe Wang, and Wojciech Golab. Toward Linearizability Testing for Multi-Word Persistent Synchronization Primitives. In 23rd International Conference on Principles of Distributed Systems (OPODIS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 153, pp. 19:1-19:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{cepeda_et_al:LIPIcs.OPODIS.2019.19,
  author =	{Cepeda, Diego and Chowdhury, Sakib and Li, Nan and Lopez, Raphael and Wang, Xinzhe and Golab, Wojciech},
  title =	{{Toward Linearizability Testing for Multi-Word Persistent Synchronization Primitives}},
  booktitle =	{23rd International Conference on Principles of Distributed Systems (OPODIS 2019)},
  pages =	{19:1--19:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-133-7},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{153},
  editor =	{Felber, Pascal and Friedman, Roy and Gilbert, Seth and Miller, Avery},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.OPODIS.2019.19},
  URN =		{urn:nbn:de:0030-drops-118050},
  doi =		{10.4230/LIPIcs.OPODIS.2019.19},
  annote =	{Keywords: Shared memory, persistent memory, synchronization, multi-word primitives, concurrency, correctness, software testing}
}
  • Refine by Type
  • 15 Document/PDF
  • 14 Document/HTML

  • Refine by Publication Year
  • 6 2025
  • 2 2024
  • 6 2023
  • 1 2020

  • Refine by Author
  • 3 Jiménez-Ruiz, Ernesto
  • 3 Lissandrini, Matteo
  • 2 Biswas, Russa
  • 2 Bonifati, Angela
  • 2 Calbimonte, Jean-Paul
  • Show More...

  • Refine by Series/Journal
  • 4 LIPIcs
  • 1 OASIcs
  • 10 TGDK

  • Refine by Classification
  • 4 Information systems → Graph-based database models
  • 3 Computing methodologies → Knowledge representation and reasoning
  • 2 Applied computing → Computational genomics
  • 2 Computing methodologies → Description logics
  • 2 Computing methodologies → Natural language processing
  • Show More...

  • Refine by Keyword
  • 2 Knowledge Graphs
  • 2 Knowledge graphs
  • 1 Autonomous Systems
  • 1 Biodiversity
  • 1 Biomedical
  • Show More...

Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail