16 Search Results for "Patterson, Daniel"


Document
Climate Change: What is Computing’s Responsibility? (Dagstuhl Perspectives Workshop 25122)

Authors: Bran Knowles, Vicki L. Hanson, Christoph Becker, Mike Berners-Lee, Andrew A. Chien, Benoit Combemale, Vlad Coroamă, Koen De Bosschere, Yi Ding, Adrian Friday, Boris Gamazaychikov, Lynda Hardman, Simon Hinterholzer, Mattias Höjer, Lynn Kaack, Lenneke Kuijer, Anne-Laure Ligozat, Jan Tobias Muehlberg, Yunmook Nah, Thomas Olsson, Anne-Cécile Orgerie, Daniel Pargman, Birgit Penzenstadler, Tom Romanoff, Emma Strubell, Colin Venters, and Junhua Zhao

Published in: Dagstuhl Manifestos, Volume 11, Issue 1 (2025)


Abstract
This Manifesto was produced from the Perspectives Workshop 25122 entitled "Climate Change: What is Computing’s Responsibility?" held March 16-19, 2025, at Schloss Dagstuhl, Germany. The Workshop provided a forum for world-leading computer scientists and expert consultants on environmental policy and sustainable transition to engage in a critical and urgent conversation about computing’s responsibilities in addressing climate change - or more aptly, climate crisis. The resulting Manifesto outlines commitments and directions for future action which, if adopted as a basis for more responsible computing practices, will help ensure that these technologies do not threaten the long-term habitability of the planet. We preface our Manifesto with a recognition that humanity is on a path that is not in agreement with international global warming targets and explore how computing technologies are currently hastening the overshoot of these boundaries. We critically assess the vaunted potential for harnessing computing technologies for the mitigation of global warming, agreeing that, under current circumstances, computing is contributing to negative environmental impacts in other sectors. Computing primarily improves efficiency and reduces costs, which leads to more consumption and more negative environmental impact. Relying solely on efficiency gains in computing has thus far proven to be insufficient to curb global greenhouse gas emissions. Therefore, computing’s purpose within a strategy for tackling climate change must be reimagined. Our recommendations cover changes that need to be urgently made to the design priorities of computing technologies, but also speak to the more systemic shift in mindset, with sustainability and human rights providing a necessary moral foundation for developing the kinds of computing technologies most needed by society. We also stress the importance of digital policy that accounts for both the direct material impacts of computing and the detrimental indirect impacts arising from computing-enabled efficiencies, and the role of computing professionals in informing policy making.

Cite as

Bran Knowles, Vicki L. Hanson, Christoph Becker, Mike Berners-Lee, Andrew A. Chien, Benoit Combemale, Vlad Coroamă, Koen De Bosschere, Yi Ding, Adrian Friday, Boris Gamazaychikov, Lynda Hardman, Simon Hinterholzer, Mattias Höjer, Lynn Kaack, Lenneke Kuijer, Anne-Laure Ligozat, Jan Tobias Muehlberg, Yunmook Nah, Thomas Olsson, Anne-Cécile Orgerie, Daniel Pargman, Birgit Penzenstadler, Tom Romanoff, Emma Strubell, Colin Venters, and Junhua Zhao. Climate Change: What is Computing’s Responsibility? (Dagstuhl Perspectives Workshop 25122). In Dagstuhl Manifestos, Volume 11, Issue 1, pp. 1-18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{knowles_et_al:DagMan.11.1.1,
  author =	{Knowles, Bran and Hanson, Vicki L. and Becker, Christoph and Berners-Lee, Mike and Chien, Andrew A. and Combemale, Benoit and Coroam\u{a}, Vlad and De Bosschere, Koen and Ding, Yi and Friday, Adrian and Gamazaychikov, Boris and Hardman, Lynda and Hinterholzer, Simon and H\"{o}jer, Mattias and Kaack, Lynn and Kuijer, Lenneke and Ligozat, Anne-Laure and Muehlberg, Jan Tobias and Nah, Yunmook and Olsson, Thomas and Orgerie, Anne-C\'{e}cile and Pargman, Daniel and Penzenstadler, Birgit and Romanoff, Tom and Strubell, Emma and Venters, Colin and Zhao, Junhua},
  title =	{{Climate Change: What is Computing’s Responsibility? (Dagstuhl Perspectives Workshop 25122)}},
  pages =	{1--18},
  journal =	{Dagstuhl Manifestos},
  ISSN =	{2193-2433},
  year =	{2025},
  volume =	{11},
  number =	{1},
  editor =	{Knowles, Bran and Hanson, Vicki L. and Becker, Christoph and Berners-Lee, Mike and Chien, Andrew A. and Combemale, Benoit and Coroam\u{a}, Vlad and De Bosschere, Koen and Ding, Yi and Friday, Adrian and Gamazaychikov, Boris and Hardman, Lynda and Hinterholzer, Simon and H\"{o}jer, Mattias and Kaack, Lynn and Kuijer, Lenneke and Ligozat, Anne-Laure and Muehlberg, Jan Tobias and Nah, Yunmook and Olsson, Thomas and Orgerie, Anne-C\'{e}cile and Pargman, Daniel and Penzenstadler, Birgit and Romanoff, Tom and Strubell, Emma and Venters, Colin and Zhao, Junhua},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagMan.11.1.1},
  URN =		{urn:nbn:de:0030-drops-250724},
  doi =		{10.4230/DagMan.11.1.1},
  annote =	{Keywords: sustainability, climate change, efficiency, supply chain management, climate modelling}
}
Document
Extended Abstract
Toward a Typed Intermediate Language for R (Extended Abstract)

Authors: Mickaël Laurent, Jakob Hain, Filip Krikava, Sebastián Krynski, and Jan Vitek

Published in: OASIcs, Volume 134, Companion Proceedings of the 9th International Conference on the Art, Science, and Engineering of Programming (Programming 2025)


Abstract
Compilers for dynamic languages often rely on intermediate representations with explicit type annotations to facilitate writing program transformations. This paper documents the design of FIŘ, a new typed intermediate representation for a just-in-time compiler for the R programming language. Type annotations in FIŘ capture properties such as sharing, the potential for effects, and compiler speculations. In this extended abstract, we focus on the sharing properties, which may be used to optimize away some copies of values.
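For intuition about how such sharing annotations pay off: R has copy-on-write value semantics, so a compiler that can prove a value unshared may mutate it in place instead of copying. The Python sketch below is illustrative only; the class name and flag are invented here, and FIŘ's actual typed representation is far richer. It models a vector whose 'shared' flag stands in for a static sharing annotation.

# Minimal copy-on-write sketch: the 'shared' flag plays the role of a
# static sharing annotation; mutation copies only when the value may
# be aliased. Illustrative only, not FIR's actual representation.
class RVector:
    def __init__(self, data, shared=False):
        self.data = list(data)
        self.shared = shared

    def assign(self, i, x):
        if self.shared:
            self.data = list(self.data)  # defensive copy, as value semantics require
            self.shared = False          # the copy is uniquely owned
        self.data[i] = x                 # in-place update is now safe

v = RVector([1, 2, 3], shared=True)
v.assign(0, 99)   # first write triggers one copy
v.assign(1, 98)   # provably unshared: no further copies
print(v.data)     # [99, 98, 3]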

Cite as

Mickaël Laurent, Jakob Hain, Filip Krikava, Sebastián Krynski, and Jan Vitek. Toward a Typed Intermediate Language for R (Extended Abstract). In Companion Proceedings of the 9th International Conference on the Art, Science, and Engineering of Programming (Programming 2025). Open Access Series in Informatics (OASIcs), Volume 134, pp. 24:1-24:4, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{laurent_et_al:OASIcs.Programming.2025.24,
  author =	{Laurent, Micka\"{e}l and Hain, Jakob and Krikava, Filip and Krynski, Sebasti\'{a}n and Vitek, Jan},
  title =	{{Toward a Typed Intermediate Language for R}},
  booktitle =	{Companion Proceedings of the 9th International Conference on the Art, Science, and Engineering of Programming (Programming 2025)},
  pages =	{24:1--24:4},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-382-9},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{134},
  editor =	{Edwards, Jonathan and Perera, Roly and Petricek, Tomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Programming.2025.24},
  URN =		{urn:nbn:de:0030-drops-243086},
  doi =		{10.4230/OASIcs.Programming.2025.24},
  annote =	{Keywords: JIT, compilation, static typing, ownership, copy-on-write, dynamic language}
}
Document
On the I/O Complexity of the Cocke-Younger-Kasami Algorithm and of a Family of Related Dynamic Programming Algorithms

Authors: Lorenzo De Stefani and Vedant Gupta

Published in: LIPIcs, Volume 349, 19th International Symposium on Algorithms and Data Structures (WADS 2025)


Abstract
Asymptotically tight lower bounds are derived for the Input/Output (I/O) complexity of a class of dynamic programming algorithms, including matrix chain multiplication, optimal polygon triangulation, and the construction of optimal binary search trees. Assuming no recomputation of intermediate values, we establish an Ω(n³/(√M B)) I/O lower bound, where n denotes the size of the input, M denotes the size of the available fast memory (cache), and B denotes the number of words moved per block transfer. When recomputation is allowed, we show that the same bound holds for M < cn, where c is a positive constant. In the case where M ≥ 2n, we show an Ω(n/B) I/O lower bound. We also discuss algorithms for which the number of executed I/O operations matches asymptotically each of the presented lower bounds, which are thus asymptotically tight. Additionally, we refine our general method to obtain a lower bound for the I/O complexity of the Cocke-Younger-Kasami algorithm, where the size of the grammar impacts the I/O complexity. An upper bound with asymptotically matching performance in many cases is also provided.
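For readers who have not seen the Cocke-Younger-Kasami algorithm recently, the textbook O(n³)-work recognizer below (a standard formulation, independent of this paper's I/O analysis) shows the triangular dynamic-programming table whose memory-access pattern the bounds above concern.

# Textbook CYK recognizer for a grammar in Chomsky normal form.
# Standard formulation for illustration; the paper bounds the I/O cost
# of this family of triangular DP computations, not this code.
def cyk(word, unary, binary, start):
    # unary:  terminal a  -> set of nonterminals A with A -> a
    # binary: pair (B, C) -> set of nonterminals A with A -> B C
    n = len(word)
    # table[i][j] holds the nonterminals deriving word[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, a in enumerate(word):
        table[i][0] = set(unary.get(a, ()))
    for span in range(2, n + 1):          # substring length
        for i in range(n - span + 1):     # start position
            for k in range(1, span):      # split point
                for B in table[i][k - 1]:
                    for C in table[i + k][span - k - 1]:
                        table[i][span - 1] |= binary.get((B, C), set())
    return start in table[0][n - 1]

# Grammar: S -> A B, A -> 'a', B -> 'b'
print(cyk("ab", {'a': {'A'}, 'b': {'B'}}, {('A', 'B'): {'S'}}, 'S'))  # True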

Cite as

Lorenzo De Stefani and Vedant Gupta. On the I/O Complexity of the Cocke-Younger-Kasami Algorithm and of a Family of Related Dynamic Programming Algorithms. In 19th International Symposium on Algorithms and Data Structures (WADS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 349, pp. 49:1-49:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{destefani_et_al:LIPIcs.WADS.2025.49,
  author =	{De Stefani, Lorenzo and Gupta, Vedant},
  title =	{{On the I/O Complexity of the Cocke-Younger-Kasami Algorithm and of a Family of Related Dynamic Programming Algorithms}},
  booktitle =	{19th International Symposium on Algorithms and Data Structures (WADS 2025)},
  pages =	{49:1--49:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-398-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{349},
  editor =	{Morin, Pat and Oh, Eunjin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WADS.2025.49},
  URN =		{urn:nbn:de:0030-drops-242800},
  doi =		{10.4230/LIPIcs.WADS.2025.49},
  annote =	{Keywords: I/O complexity, Dynamic Programming Algorithms, Lower Bounds, Recomputation, Cocke-Younger-Kasami}
}
Document
An Efficient Data Structure and Algorithm for Long-Match Query in Run-Length Compressed BWT

Authors: Ahsan Sanaullah, Degui Zhi, and Shaojie Zhang

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)


Abstract
String matching problems in bioinformatics typically ask for exact substring matches between a query and a reference text. Previous formulations often focus on maximum exact matches (MEMs). However, multiple occurrences of substrings of the query in the text that are long enough but not maximal may not be captured by MEMs. Such long matches can be informative, especially when the text is a collection of similar sequences such as genomes. In this paper, we describe a new type of match between a pattern and a text: matches that are not necessarily maximal in the query but still contain useful matching information, which we call locally maximal exact matches (LEMs). There are usually a large number of LEMs, so we only consider those above some length threshold ℒ. These are referred to as long LEMs. The purpose of long LEMs is to capture substring matches between a query and a text that are not necessarily maximal in the pattern but still long enough to be important. Efficient algorithms for finding long LEMs are therefore desired for these datasets. However, these datasets are too large to query on traditional string indexes. Fortunately, they are highly repetitive, and compressed string indexes that take advantage of this redundancy while retaining efficient querying capability have recently been proposed as a solution. We therefore give an efficient algorithm for computing all the long LEMs of a query and a text in a BWT-runs compressed string index. We describe an O(m+occ) expected time algorithm that relies on an O(r)-words space string index for outputting all long LEMs of a pattern with respect to a text, given the matching statistics of the pattern with respect to the text. Here m is the length of the query, occ is the number of long LEMs outputted, and r is the number of runs in the BWT of the text. The O(r)-space string index we describe relies on an adaptation of the move data structure of Nishimoto and Tabei. We are able to support LCP[i] queries in constant time given SA[i]; in other words, we answer PLCP[i] queries in constant time. These PLCP queries enable the efficient long LEM query. Long LEMs may provide useful similarity information between a pattern and a text that MEMs may ignore. This information is particularly useful in pangenome and biobank scale haplotype panel contexts.
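To pin down the definition, the quadratic brute-force sketch below enumerates long LEMs directly (illustrative only; the point of the paper is to achieve O(m + occ) expected time over an O(r)-word index rather than this O(mn) scan): a match of length at least L is reported when it cannot be extended by one character on either side at that particular text occurrence.

# Brute-force long LEMs: exact matches of length >= L that are locally
# maximal, i.e. extendable neither left nor right at that occurrence.
# Illustrative only; the paper computes these over a BWT-runs index.
def long_lems(query, text, L):
    out = []
    for i in range(len(query)):
        for j in range(len(text)):
            if query[i] != text[j]:
                continue
            if i > 0 and j > 0 and query[i - 1] == text[j - 1]:
                continue                   # extendable left: not locally maximal
            k = 0
            while (i + k < len(query) and j + k < len(text)
                   and query[i + k] == text[j + k]):
                k += 1                     # extend right as far as possible
            if k >= L:
                out.append((i, j, k))      # (query pos, text pos, length)
    return out

print(long_lems("abcab", "xabcya", 2))     # [(0, 1, 3), (3, 1, 2)]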

Cite as

Ahsan Sanaullah, Degui Zhi, and Shaojie Zhang. An Efficient Data Structure and Algorithm for Long-Match Query in Run-Length Compressed BWT. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 17:1-17:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{sanaullah_et_al:LIPIcs.WABI.2025.17,
  author =	{Sanaullah, Ahsan and Zhi, Degui and Zhang, Shaojie},
  title =	{{An Efficient Data Structure and Algorithm for Long-Match Query in Run-Length Compressed BWT}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{17:1--17:25},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.17},
  URN =		{urn:nbn:de:0030-drops-239433},
  doi =		{10.4230/LIPIcs.WABI.2025.17},
  annote =	{Keywords: BWT, LEM, Long LEM, MEM, Run Length Compressed BWT, Move Data Structure, Pangenome}
}
Document
Haplotype-Aware Long-Read Error Correction

Authors: Parvesh Barak, Daniel Gibney, and Chirag Jain

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)


Abstract
Error correction of long reads is an important initial step in genome assembly workflows. For organisms with ploidy greater than one, it is important to preserve haplotype-specific variation during read correction. This challenge has driven the development of several haplotype-aware correction methods. However, existing methods are based on either ad-hoc heuristics or deep learning approaches. In this paper, we introduce a rigorous formulation for this problem. Our approach builds on the minimum error correction framework used in reference-based haplotype phasing. We prove that the proposed formulation for error correction of reads in de novo context, i.e., without using a reference genome, is NP-hard. To make our exact algorithm scale to large datasets, we introduce practical heuristics. Experiments using PacBio HiFi sequencing datasets from human and plant genomes show that our approach achieves accuracy comparable to state-of-the-art methods. The software is freely available at https://github.com/at-cg/HALE.
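The minimum error correction (MEC) objective the formulation builds on is easy to state for the diploid case: partition the reads into two haplotype groups so that the number of entries that must be flipped to make each group column-consistent is minimized. Below is a brute-force Python sketch of that classical objective (the reference-based starting point, not the authors' de novo algorithm or its heuristics).

# Brute-force diploid MEC: try every bipartition of the reads and charge,
# per column and group, the minority allele count (the flips needed).
# Sketch of the classical objective only, not the HALE algorithm.
from itertools import product

def mec(reads):
    # reads: equal-length strings over {'0', '1', '-'} ('-' = not covered)
    best = None
    for assign in product((0, 1), repeat=len(reads)):
        cost = 0
        for side in (0, 1):
            group = [r for r, s in zip(reads, assign) if s == side]
            for j in range(len(reads[0])):
                zeros = sum(r[j] == '0' for r in group)
                ones = sum(r[j] == '1' for r in group)
                cost += min(zeros, ones)
        best = cost if best is None else min(best, cost)
    return best

print(mec(["0011", "0011", "1100", "110-", "0111"]))  # 1 (one erroneous '1')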

Cite as

Parvesh Barak, Daniel Gibney, and Chirag Jain. Haplotype-Aware Long-Read Error Correction. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 4:1-4:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{barak_et_al:LIPIcs.WABI.2025.4,
  author =	{Barak, Parvesh and Gibney, Daniel and Jain, Chirag},
  title =	{{Haplotype-Aware Long-Read Error Correction}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{4:1--4:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.4},
  URN =		{urn:nbn:de:0030-drops-239300},
  doi =		{10.4230/LIPIcs.WABI.2025.4},
  annote =	{Keywords: Genome assembly, phasing, clustering, overlap graph, consensus}
}
Document
Dolphyin: A Combinatorial Algorithm for Identifying 1-Dollo Phylogenies in Cancer

Authors: Daniel W. Feng and Mohammed El-Kebir

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)


Abstract
Several recent cancer phylogeny inference methods have used the k-Dollo evolutionary model for single-nucleotide variants. Specifically, in this problem one is given an m × n binary matrix B and seeks a rooted tree T with m leaves that correspond to the m rows of B, and each node of T is labeled by a binary state for each of the n characters subject to the restriction that each character is gained at most once (0-to-1 transition) and subsequently lost at most k times (1-to-0 transitions). The 1-Dollo variant, also known as the persistent perfect phylogeny, where one is restricted to at most k = 1 loss per character, has been studied extensively, but its hardness remains an open question. Here, we prove that the 1-Dollo Linear Phylogeny (1DLP) problem, where we additionally require the resulting 1-Dollo phylogeny T to be linear, is equivalent to verifying whether the input matrix B adheres to the Consecutive Ones Property (C1P), which can be solved in polynomial time. Due to the equivalence, several known NP-hardness results for relevant variants of C1P carry over to 1DLP, including the minimization of false negatives (0-to-1 modifications to the input matrix B) or the allowance of 2 gains and 2 losses. We furthermore show how we can recursively decompose any, not necessarily linear, 1-Dollo phylogeny T into several 1-Dollo linear phylogenies, connected by matching branching points. We extend this characterization to matrices B that admit 1-Dollo phylogenies, giving necessary and sufficient conditions for the existence of a novel decomposition of B into several submatrices and corresponding branching points. This decomposition forms the basis of Dolphyin, a new exponential-time algorithm for inferring 1-Dollo phylogenies that efficiently leverages the determination of linear 1-Dollo phylogenies as a subroutine. Dolphyin can also be applied to input matrices B with false negatives. We demonstrate that Dolphyin is runtime-competitive with SPhyR, a previous algorithm based on integer linear programming, on simulated datasets. We additionally analyze simulated datasets with false negative errors and find that in the median case, Dolphyin infers 1-Dollo phylogenies with inferred error rates at or below the ground truth rate. Finally, we apply Dolphyin to 99 acute myeloid leukemia single-cell sequencing datasets, finding that the majority of the cancers can be explained by 1-Dollo phylogenies with false negative error rates in line with the used sequencing technology. Availability. Dolphyin is available at: https://github.com/elkebir-group/Dolphyin.
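Since the equivalence above reduces 1DLP to the Consecutive Ones Property, it may help to recall what C1P asks: can the columns of B be permuted so that the 1s in every row are consecutive? A brute-force check follows (fine for tiny matrices; real C1P tests, e.g. via PQ-trees, run in polynomial time).

# Brute-force C1P check over all column permutations; exponential, for
# illustration only (polynomial-time algorithms exist, e.g. PQ-trees).
from itertools import permutations

def consecutive(row):
    ones = [i for i, x in enumerate(row) if x == 1]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)

def has_c1p(matrix):
    cols = range(len(matrix[0]))
    return any(all(consecutive([row[c] for c in perm]) for row in matrix)
               for perm in permutations(cols))

print(has_c1p([[1, 0, 1],
               [0, 1, 1]]))    # True  (order the columns as 0, 2, 1)
print(has_c1p([[1, 1, 0],
               [0, 1, 1],
               [1, 0, 1]]))    # False (all three column pairs cannot be adjacent)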

Cite as

Daniel W. Feng and Mohammed El-Kebir. Dolphyin: A Combinatorial Algorithm for Identifying 1-Dollo Phylogenies in Cancer. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 9:1-9:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{feng_et_al:LIPIcs.WABI.2025.9,
  author =	{Feng, Daniel W. and El-Kebir, Mohammed},
  title =	{{Dolphyin: A Combinatorial Algorithm for Identifying 1-Dollo Phylogenies in Cancer}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{9:1--9:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.9},
  URN =		{urn:nbn:de:0030-drops-239356},
  doi =		{10.4230/LIPIcs.WABI.2025.9},
  annote =	{Keywords: Intra-tumor heterogeneity, persistent perfect phylogeny, consecutive ones property, combinatorics}
}
Document
IsaBIL: A Framework for Verifying (In)correctness of Binaries in Isabelle/HOL

Authors: Matt Griffin, Brijesh Dongol, and Azalea Raad

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)


Abstract
This paper presents IsaBIL, a binary analysis framework in Isabelle/HOL that is based on the widely used Binary Analysis Platform (BAP). Specifically, in IsaBIL, we formalise BAP’s intermediate language, called BIL, and integrate it with Hoare logic (to enable proofs of correctness) as well as incorrectness logic (to enable proofs of incorrectness). IsaBIL inherits the full flexibility of BAP, allowing us to verify binaries for a wide range of languages (C, C++, Rust), toolchains (LLVM, Ghidra) and target architectures (x86, RISC-V), and can also be used when the source code for a binary is unavailable. To make verification tractable, we develop a number of big-step rules that combine BIL’s existing small-step rules at different levels of abstraction to support reuse. We develop high-level reasoning rules for RISC-V instructions (our main target architecture) to further optimise verification. Additionally, we develop Isabelle proof tactics that exploit common patterns in C binaries for RISC-V to discharge large numbers of proof goals (often in the 100s) automatically. IsaBIL includes an Isabelle/ML based parser for BIL programs, allowing one to automatically generate the associated Isabelle/HOL program locale from a BAP output. Taken together, IsaBIL provides a highly flexible proof environment for program binaries. As case studies, we prove correctness of key examples from the Joint Strike Fighter coding standards and the MITRE database.
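The two logics integrated here make complementary claims; schematically (standard partial-correctness Hoare triples and, for incorrectness, O'Hearn-style under-approximate triples; the notation below is ours, not IsaBIL's):

\[ \{P\}\; C\; \{Q\} \iff \forall \sigma, \sigma'.\ (\sigma \models P \wedge \sigma \xrightarrow{C} \sigma') \Rightarrow \sigma' \models Q \]
\[ [P]\; C\; [Q] \iff \forall \sigma'.\ \sigma' \models Q \Rightarrow \exists \sigma.\ \sigma \models P \wedge \sigma \xrightarrow{C} \sigma' \]

A Hoare triple thus over-approximates the reachable final states (everything reachable satisfies Q), while an incorrectness triple under-approximates them (everything in Q is genuinely reachable), so any bug exhibited under [P] C [Q] is a true positive.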

Cite as

Matt Griffin, Brijesh Dongol, and Azalea Raad. IsaBIL: A Framework for Verifying (In)correctness of Binaries in Isabelle/HOL. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 14:1-14:30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@InProceedings{griffin_et_al:LIPIcs.ECOOP.2025.14,
  author =	{Griffin, Matt and Dongol, Brijesh and Raad, Azalea},
  title =	{{IsaBIL: A Framework for Verifying (In)correctness of Binaries in Isabelle/HOL}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{14:1--14:30},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.14},
  URN =		{urn:nbn:de:0030-drops-233070},
  doi =		{10.4230/LIPIcs.ECOOP.2025.14},
  annote =	{Keywords: Binary Analysis Platform, Isabelle/HOL, Hoare Logic, Incorrectness Logic}
}
Document
Survey
Uncertainty Management in the Construction of Knowledge Graphs: A Survey

Authors: Lucas Jarnac, Yoan Chabot, and Miguel Couceiro

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1 (2025)


Abstract
Knowledge Graphs (KGs) are a major asset for companies thanks to their great flexibility in data representation and their numerous applications, e.g., vocabulary sharing, Q&A, or recommendation systems. To build a KG, it is common practice to rely on automatic methods for extracting knowledge from various heterogeneous sources. However, in a noisy and uncertain world, knowledge may not be reliable, and conflicts between data sources may occur. Integrating unreliable data would directly impact the use of the KG, so such conflicts must be resolved. This could be done manually by selecting the best data to integrate. This first approach is highly accurate, but costly and time-consuming. Recent efforts therefore focus on automatic approaches, a challenging task since it requires handling the uncertainty of extracted knowledge throughout its integration into the KG. We survey state-of-the-art approaches in this direction and present constructions of both open and enterprise KGs. We then describe different knowledge extraction methods and discuss downstream tasks after knowledge acquisition, including KG completion using embedding models, knowledge alignment, and knowledge fusion, in order to address the problem of knowledge uncertainty in KG construction. We conclude with a discussion of the remaining challenges and perspectives when constructing a KG while taking uncertainty into account.

Cite as

Lucas Jarnac, Yoan Chabot, and Miguel Couceiro. Uncertainty Management in the Construction of Knowledge Graphs: A Survey. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 3:1-3:48, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{jarnac_et_al:TGDK.3.1.3,
  author =	{Jarnac, Lucas and Chabot, Yoan and Couceiro, Miguel},
  title =	{{Uncertainty Management in the Construction of Knowledge Graphs: A Survey}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:48},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.3},
  URN =		{urn:nbn:de:0030-drops-233733},
  doi =		{10.4230/TGDK.3.1.3},
  annote =	{Keywords: Knowledge reconciliation, Uncertainty, Heterogeneous sources, Knowledge graph construction}
}
Document
Research
Talking Wikidata: Communication Patterns and Their Impact on Community Engagement in Collaborative Knowledge Graphs

Authors: Elisavet Koutsiana, Ioannis Reklos, Kholoud Saad Alghamdi, Nitisha Jain, Albert Meroño-Peñuela, and Elena Simperl

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1 (2025)


Abstract
We study collaboration patterns in Wikidata, one of the world's largest open source collaborative knowledge graph (KG) communities. Collaborative KG communities play a key role in structuring machine-readable knowledge to support AI systems such as conversational agents. However, these communities face challenges related to long-term member engagement, as a small subset of contributors is often responsible for the majority of contributions and decision-making. While prior research has explored contributors' roles and lifespans, discussions within collaborative KG communities remain understudied. To fill this gap, we investigated the behavioural patterns of contributors and the factors affecting their communication and participation. We analysed all the discussions on Wikidata using a mixed-methods approach, including statistical tests, network analysis, and text and graph embedding representations. Our findings reveal that the interactions between Wikidata editors form a small-world network, resilient to dropouts and inclusive, in which both the network topology and the discussion content influence the continuity of conversations. Furthermore, the account age of Wikidata members and their conversations are significant factors in their long-term engagement with the project. Our observations and recommendations can benefit the Wikidata and semantic web communities, providing guidance on how to improve collaborative environments for sustainability, growth, and quality.
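For context on the small-world claim: the conventional diagnostic compares clustering and average path length against a random graph with the same number of nodes and edges. A minimal networkx sketch of that generic check follows (the graph below is synthetic; this is the standard diagnostic, not the authors' analysis pipeline).

# Generic small-world diagnostic: clustering well above, and average
# shortest path close to, an equivalent random graph. Synthetic data;
# illustrates the standard check, not the paper's pipeline.
import networkx as nx

G = nx.watts_strogatz_graph(n=1000, k=10, p=0.1, seed=42)  # stand-in network
R = nx.gnm_random_graph(G.number_of_nodes(), G.number_of_edges(), seed=42)

def stats(g):
    # restrict path lengths to the largest component so they are defined
    giant = g.subgraph(max(nx.connected_components(g), key=len))
    return nx.average_clustering(g), nx.average_shortest_path_length(giant)

for name, g in (("observed", G), ("random", R)):
    c, l = stats(g)
    print(f"{name}: clustering={c:.3f}, avg shortest path={l:.3f}")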

Cite as

Elisavet Koutsiana, Ioannis Reklos, Kholoud Saad Alghamdi, Nitisha Jain, Albert Meroño-Peñuela, and Elena Simperl. Talking Wikidata: Communication Patterns and Their Impact on Community Engagement in Collaborative Knowledge Graphs. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 2:1-2:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


BibTeX

@Article{koutsiana_et_al:TGDK.3.1.2,
  author =	{Koutsiana, Elisavet and Reklos, Ioannis and Alghamdi, Kholoud Saad and Jain, Nitisha and Mero\~{n}o-Pe\~{n}uela, Albert and Simperl, Elena},
  title =	{{Talking Wikidata: Communication Patterns and Their Impact on Community Engagement in Collaborative Knowledge Graphs}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{2:1--2:27},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.2},
  URN =		{urn:nbn:de:0030-drops-230114},
  doi =		{10.4230/TGDK.3.1.2},
  annote =	{Keywords: collaborative knowledge graph, network analysis, graph embeddings, text embeddings}
}
Document
Survey
Semantic Web: Past, Present, and Future

Authors: Ansgar Scherp, Gerd Groener, Petr Škoda, Katja Hose, and Maria-Esther Vidal

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 1 (2024): Special Issue on Trends in Graph Data and Knowledge - Part 2


Abstract
Ever since the vision was formulated, the Semantic Web has inspired many generations of innovations. Semantic technologies have been used to share vast amounts of information on the Web, enhance it with semantics to give it meaning, and enable inference and reasoning on it. Throughout the years, semantic technologies, and in particular knowledge graphs, have been used in search engines, data integration, enterprise settings, and machine learning. In this paper, we recap the classical concepts and foundations of the Semantic Web as well as modern and recent concepts and applications, building upon these foundations. The classical topics we cover include knowledge representation, creating and validating knowledge on the Web, reasoning and linking, and distributed querying. We enhance this classical view of the so-called "Semantic Web Layer Cake" with an update of recent concepts that include provenance, security and trust, as well as a discussion of practical impacts from industry-led contributions. We conclude with an outlook on the future directions of the Semantic Web. This is a living document. If you would like to contribute, please contact the first author and visit: https://github.com/ascherp/semantic-web-primer

Cite as

Ansgar Scherp, Gerd Groener, Petr Škoda, Katja Hose, and Maria-Esther Vidal. Semantic Web: Past, Present, and Future. In Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 1, pp. 3:1-3:37, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@Article{scherp_et_al:TGDK.2.1.3,
  author =	{Scherp, Ansgar and Groener, Gerd and \v{S}koda, Petr and Hose, Katja and Vidal, Maria-Esther},
  title =	{{Semantic Web: Past, Present, and Future}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:37},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.1.3},
  URN =		{urn:nbn:de:0030-drops-198607},
  doi =		{10.4230/TGDK.2.1.3},
  annote =	{Keywords: Linked Open Data, Semantic Web Graphs, Knowledge Graphs}
}
Document
Position
Standardizing Knowledge Engineering Practices with a Reference Architecture

Authors: Bradley P. Allen and Filip Ilievski

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 1 (2024): Special Issue on Trends in Graph Data and Knowledge - Part 2


Abstract
Knowledge engineering is the process of creating and maintaining knowledge-producing systems. Throughout the history of computer science and AI, knowledge engineering workflows have been widely used given the importance of high-quality knowledge for reliable intelligent agents. Meanwhile, the scope of knowledge engineering, as apparent from its target tasks and use cases, has been shifting, together with its paradigms such as expert systems, semantic web, and language modeling. The intended use cases and supported user requirements across these paradigms have not been analyzed globally, as new paradigms often satisfy prior pain points while possibly introducing new ones. The recent abstraction of systemic patterns into a boxology provides an opening for aligning the requirements and use cases of knowledge engineering with the systems, components, and software that can satisfy them best; however, this direction has not been explored to date. This paper proposes a vision of harmonizing the best practices in the field of knowledge engineering by leveraging the software engineering methodology of creating reference architectures. We describe how a reference architecture can be iteratively designed and implemented to associate user needs with recurring systemic patterns, building on top of existing knowledge engineering workflows and boxologies. We provide a six-step roadmap that can enable the development of such an architecture, consisting of scope definition, selection of information sources, architectural analysis, synthesis of an architecture based on the information source analysis, evaluation through instantiation, and, ultimately, instantiation into a concrete software architecture. We provide an initial design and outcome of the definition of architectural scope, selection of information sources, and analysis. As the remaining steps of design, evaluation, and instantiation of the architecture are largely use-case specific, we provide a detailed description of their procedures and point to relevant examples. We expect that following through on this vision will lead to well-grounded reference architectures for knowledge engineering, will advance the ongoing initiatives of organizing the neurosymbolic knowledge engineering space, and will build new links to the software architectures and data science communities.

Cite as

Bradley P. Allen and Filip Ilievski. Standardizing Knowledge Engineering Practices with a Reference Architecture. In Special Issue on Trends in Graph Data and Knowledge - Part 2. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 1, pp. 5:1-5:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


BibTeX

@Article{allen_et_al:TGDK.2.1.5,
  author =	{Allen, Bradley P. and Ilievski, Filip},
  title =	{{Standardizing Knowledge Engineering Practices with a Reference Architecture}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{5:1--5:23},
  ISSN =	{2942-7517},
  year =	{2024},
  volume =	{2},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.2.1.5},
  URN =		{urn:nbn:de:0030-drops-198623},
  doi =		{10.4230/TGDK.2.1.5},
  annote =	{Keywords: knowledge engineering, knowledge graphs, quality attributes, software architectures, sociotechnical systems}
}
Document
Position
Large Language Models and Knowledge Graphs: Opportunities and Challenges

Authors: Jeff Z. Pan, Simon Razniewski, Jan-Christoph Kalo, Sneha Singhania, Jiaoyan Chen, Stefan Dietze, Hajira Jabeen, Janna Omeliyanenko, Wen Zhang, Matteo Lissandrini, Russa Biswas, Gerard de Melo, Angela Bonifati, Edlira Vakaj, Mauro Dragoni, and Damien Graux

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge


Abstract
Large Language Models (LLMs) have taken Knowledge Representation - and the world - by storm. This inflection point marks a shift from explicit knowledge representation to a renewed focus on the hybrid representation of both explicit knowledge and parametric knowledge. In this position paper, we will discuss some of the common debate points within the community on LLMs (parametric knowledge) and Knowledge Graphs (explicit knowledge) and speculate on opportunities and visions that the renewed focus brings, as well as related research topics and challenges.

Cite as

Jeff Z. Pan, Simon Razniewski, Jan-Christoph Kalo, Sneha Singhania, Jiaoyan Chen, Stefan Dietze, Hajira Jabeen, Janna Omeliyanenko, Wen Zhang, Matteo Lissandrini, Russa Biswas, Gerard de Melo, Angela Bonifati, Edlira Vakaj, Mauro Dragoni, and Damien Graux. Large Language Models and Knowledge Graphs: Opportunities and Challenges. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 2:1-2:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


BibTeX

@Article{pan_et_al:TGDK.1.1.2,
  author =	{Pan, Jeff Z. and Razniewski, Simon and Kalo, Jan-Christoph and Singhania, Sneha and Chen, Jiaoyan and Dietze, Stefan and Jabeen, Hajira and Omeliyanenko, Janna and Zhang, Wen and Lissandrini, Matteo and Biswas, Russa and de Melo, Gerard and Bonifati, Angela and Vakaj, Edlira and Dragoni, Mauro and Graux, Damien},
  title =	{{Large Language Models and Knowledge Graphs: Opportunities and Challenges}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{2:1--2:38},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.2},
  URN =		{urn:nbn:de:0030-drops-194766},
  doi =		{10.4230/TGDK.1.1.2},
  annote =	{Keywords: Large Language Models, Pre-trained Language Models, Knowledge Graphs, Ontology, Retrieval Augmented Language Models}
}
Document
Survey
How Does Knowledge Evolve in Open Knowledge Graphs?

Authors: Axel Polleres, Romana Pernisch, Angela Bonifati, Daniele Dell'Aglio, Daniil Dobriy, Stefania Dumbrava, Lorena Etcheverry, Nicolas Ferranti, Katja Hose, Ernesto Jiménez-Ruiz, Matteo Lissandrini, Ansgar Scherp, Riccardo Tommasini, and Johannes Wachs

Published in: Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge


Abstract
Openly available, collaboratively edited Knowledge Graphs (KGs) are key platforms for the collective management of evolving knowledge. The present work aims to provide an analysis of the obstacles related to investigating and processing specifically this central aspect of evolution in KGs. To this end, we discuss (i) the dimensions of evolution in KGs, (ii) the observability of evolution in existing, open, collaboratively constructed Knowledge Graphs over time, and (iii) possible metrics to analyse this evolution. We provide an overview of relevant state-of-the-art research, ranging from metrics developed for Knowledge Graphs specifically to potential methods from related fields such as network science. Additionally, we discuss technical approaches - and their current limitations - related to storing, analysing and processing large and evolving KGs in terms of handling typical KG downstream tasks.

Cite as

Axel Polleres, Romana Pernisch, Angela Bonifati, Daniele Dell'Aglio, Daniil Dobriy, Stefania Dumbrava, Lorena Etcheverry, Nicolas Ferranti, Katja Hose, Ernesto Jiménez-Ruiz, Matteo Lissandrini, Ansgar Scherp, Riccardo Tommasini, and Johannes Wachs. How Does Knowledge Evolve in Open Knowledge Graphs?. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 11:1-11:59, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


BibTeX

@Article{polleres_et_al:TGDK.1.1.11,
  author =	{Polleres, Axel and Pernisch, Romana and Bonifati, Angela and Dell'Aglio, Daniele and Dobriy, Daniil and Dumbrava, Stefania and Etcheverry, Lorena and Ferranti, Nicolas and Hose, Katja and Jim\'{e}nez-Ruiz, Ernesto and Lissandrini, Matteo and Scherp, Ansgar and Tommasini, Riccardo and Wachs, Johannes},
  title =	{{How Does Knowledge Evolve in Open Knowledge Graphs?}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{11:1--11:59},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.11},
  URN =		{urn:nbn:de:0030-drops-194855},
  doi =		{10.4230/TGDK.1.1.11},
  annote =	{Keywords: KG evolution, temporal KG, versioned KG, dynamic KG}
}
Document
Linking Types for Multi-Language Software: Have Your Cake and Eat It Too

Authors: Daniel Patterson and Amal Ahmed

Published in: LIPIcs, Volume 71, 2nd Summit on Advances in Programming Languages (SNAPL 2017)


Abstract
Large software systems are and should be implemented with many different languages, each suited to the domain of the task at hand. High-level business logic may be written in Java or OCaml, resource-intensive components may be written in C or Rust, and high-assurance components may be written in Coq. In some development shops, domain-specific languages are used in various parts of systems to better separate the logic of particular problems from the plumbing of general-purpose programming. But how are programmers to reason about such multi-language systems? Currently, for a programmer to reason about a single source component within this multi-language system, it is not sufficient for her to consider how her component behaves in source-level contexts. Instead, she is required to understand the target contexts that her component will be run in after compilation - which requires not only understanding aspects of the compiler, but also how target components are linked together. These target contexts may have behavior inexpressible in the source, which can impact the notion of equivalence that justifies behavior-preserving modifications of code, whether programmer refactorings or compiler optimizations. But while programmers should not have to reason about arbitrary target contexts, sometimes multi-language linking is done exactly to gain access to features unavailable in the source. To enable programmers to reason about components that link with behavior inexpressible in their language, we advocate that language designers incorporate specifications for linking into the source language. Such specifications should allow a programmer to reason about inputs from other languages in a way that remains close to the semantics of her language. Linking types are a well-specified minimal extension of a source language that allow programmers to annotate where in their programs they can link with components that are not expressible in their unadulterated source language. This gives them fine-grained control over the contexts that they must reason about and the equivalences that arise.
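As a loose illustration of the proposal (all names below are invented for this sketch; the paper's linking types are a static, type-level mechanism in the source language, not a runtime check), one can think of each link point as carrying an annotation of the foreign behaviors it admits, with linking rejected when a component exceeds them:

# Toy model of the linking-types idea: a link point is annotated with the
# source-inexpressible behaviors it permits, and linking a component whose
# behaviors exceed the annotation is rejected. Invented names throughout;
# the actual proposal is a typed, language-level extension.
FOREIGN_BEHAVIORS = {"mutable_state", "exceptions", "continuations"}

class ForeignComponent:
    def __init__(self, name, behaviors):
        assert behaviors <= FOREIGN_BEHAVIORS
        self.name, self.behaviors = name, set(behaviors)

def link(component, allowed):
    extra = component.behaviors - allowed
    if extra:
        raise TypeError(f"cannot link {component.name}: unannotated {extra}")
    return component

counter = ForeignComponent("c_counter", {"mutable_state"})
link(counter, allowed={"mutable_state"})   # ok: the annotation admits it
try:
    link(counter, allowed=set())           # this link point was kept pure
except TypeError as e:
    print(e)

The payoff mirrors the prose above: the programmer chooses, per link point, exactly which inexpressible behaviors she must reason about, rather than facing arbitrary target contexts.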

Cite as

Daniel Patterson and Amal Ahmed. Linking Types for Multi-Language Software: Have Your Cake and Eat It Too. In 2nd Summit on Advances in Programming Languages (SNAPL 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 71, pp. 12:1-12:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@InProceedings{patterson_et_al:LIPIcs.SNAPL.2017.12,
  author =	{Patterson, Daniel and Ahmed, Amal},
  title =	{{Linking Types for Multi-Language Software: Have Your Cake and Eat It Too}},
  booktitle =	{2nd Summit on Advances in Programming Languages (SNAPL 2017)},
  pages =	{12:1--12:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-032-3},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{71},
  editor =	{Lerner, Benjamin S. and Bod{\'\i}k, Rastislav and Krishnamurthi, Shriram},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SNAPL.2017.12},
  URN =		{urn:nbn:de:0030-drops-71250},
  doi =		{10.4230/LIPIcs.SNAPL.2017.12},
  annote =	{Keywords: Linking, program reasoning, equivalence, expressive power of languages, fully abstract compilation}
}
Document
Characterizing Data Dependence Constraints for Dynamic Reliability Using n-Queens Attack Domains

Authors: Eric W. D. Rozier, Kristin Y. Rozier, and Ulya Bayram

Published in: Leibniz Transactions on Embedded Systems (LITES), Volume 4, Issue 1 (2017)


Abstract
As data centers attempt to cope with the exponential growth of data, new techniques for intelligent, software-defined data centers (SDDCs) are being developed to confront the scale and pace of changing resources and requirements. For cost-constrained environments, like those increasingly present in scientific research labs, SDDCs may also provide better reliability and performability with no additional hardware through the use of dynamic syndrome allocation. To do so, the middleware layers of SDDCs must be able to calculate and account for complex dependence relationships to determine an optimal data layout. This challenge is exacerbated by the growth of constraints on the dependence problem when available resources are both large (due to a higher number of syndromes that can be stored) and small (due to the lack of available space for syndrome allocation). We present a quantitative method for characterizing these challenges using an analysis of attack domains for high-dimension variants of the n-queens problem that enables performable solutions via the SMT solver Z3. We demonstrate correctness of our technique and provide experimental evidence of its efficacy; our implementation is publicly available.
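For a flavour of the SMT encoding style, here is the classic two-dimensional n-queens problem in Z3's Python API (the standard base encoding; the paper's contribution concerns high-dimension variants and their attack domains, which this sketch does not capture):

# Classic n-queens in Z3: one integer per row giving the queen's column;
# Distinct rules out column attacks, the pairwise constraints rule out
# diagonals. Base encoding only; the paper treats high-dimension variants.
from z3 import Int, Solver, Distinct, And, sat

n = 8
q = [Int(f"q{i}") for i in range(n)]   # q[i] = column of the queen in row i
s = Solver()
s.add([And(0 <= q[i], q[i] < n) for i in range(n)])
s.add(Distinct(q))                     # no two queens in the same column
for i in range(n):
    for j in range(i + 1, n):          # no two queens on a shared diagonal
        s.add(q[i] - q[j] != i - j, q[i] - q[j] != j - i)

if s.check() == sat:
    print([s.model()[x].as_long() for x in q])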

Cite as

Eric W. D. Rozier, Kristin Y. Rozier, and Ulya Bayram. Characterizing Data Dependence Constraints for Dynamic Reliability Using n-Queens Attack Domains. In Leibniz Transactions on Embedded Systems (LITES), Volume 4, Issue 1, pp. 05:1-05:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


BibTeX

@Article{rozier_et_al:LITES-v004-i001-a005,
  author =	{Rozier, Eric W. D. and Rozier, Kristin Y. and Bayram, Ulya},
  title =	{{Characterizing Data Dependence Constraints for Dynamic Reliability Using n-Queens Attack Domains}},
  journal =	{Leibniz Transactions on Embedded Systems},
  pages =	{05:1--05:26},
  ISSN =	{2199-2002},
  year =	{2017},
  volume =	{4},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LITES-v004-i001-a005},
  URN =		{urn:nbn:de:0030-drops-192667},
  doi =		{10.4230/LITES-v004-i001-a005},
  annote =	{Keywords: SMT, Data dependence, n-queens}
}
  • Refine by Type
  • 16 Document/PDF
  • 12 Document/HTML

  • Refine by Publication Year
  • 9 2025
  • 2 2024
  • 2 2023
  • 2 2017
  • 1 2016

  • Refine by Author
  • 2 Bonifati, Angela
  • 2 Hose, Katja
  • 2 Lissandrini, Matteo
  • 2 Scherp, Ansgar
  • 1 Ahmed, Amal

  • Refine by Series/Journal
  • 6 LIPIcs
  • 1 OASIcs
  • 2 LITES
  • 6 TGDK
  • 1 DagMan

  • Refine by Classification
  • 3 Computing methodologies → Knowledge representation and reasoning
  • 2 Applied computing → Computational genomics
  • 2 General and reference → Surveys and overviews
  • 2 Information systems → Graph-based database models
  • 1 Computer systems organization → Embedded and cyber-physical systems

  • Refine by Keyword
  • 2 Knowledge Graphs
  • 1 BWT
  • 1 Binary Analysis Platform
  • 1 Cache analysis
  • 1 Cocke-Younger-Kasami
