DROPS

Document

Survey

Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2

Abstract

In recent years, knowledge graphs have gained interest and witnessed widespread applications in various domains, such as information retrieval, question-answering, recommendation systems, amongst others. Large-scale knowledge graphs to this end have demonstrated their utility in effectively representing structured knowledge. To further facilitate the application of machine learning techniques, knowledge graph embedding models have been developed. Such models can transform entities and relationships within knowledge graphs into vectors. However, these embedding models often face challenges related to noise, missing information, distribution shift, adversarial attacks, etc. This can lead to sub-optimal embeddings and incorrect inferences, thereby negatively impacting downstream applications. While the existing literature has focused so far on adversarial attacks on KGE models, the challenges related to the other critical aspects remain unexplored. In this paper, we, first of all, give a unified definition of resilience, encompassing several factors such as generalisation, in-distribution generalization, distribution adaption, and robustness. After formalizing these concepts for machine learning in general, we define them in the context of knowledge graphs. To find the gap in the existing works on resilience in the context of knowledge graphs, we perform a systematic survey, taking into account all these aspects mentioned previously. Our survey results show that most of the existing works focus on a specific aspect of resilience, namely robustness. After categorizing such works based on their respective aspects of resilience, we discuss the challenges and future research directions.

Cite as

Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo. Resilience in Knowledge Graph Embeddings. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 1:1-1:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{sharma_et_al:TGDK.3.2.1,
  author =	{Sharma, Arnab and Kouagou, N'Dah Jean and Ngomo, Axel-Cyrille Ngonga},
  title =	{{Resilience in Knowledge Graph Embeddings}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:38},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.1},
  URN =		{urn:nbn:de:0030-drops-248117},
  doi =		{10.4230/TGDK.3.2.1},
  annote =	{Keywords: Knowledge graphs, Resilience, Robustness}
}

Document

Research

DOI: 10.4230/TGDK.3.2.3

GraphRAG on Technical Documents - Impact of Knowledge Graph Schema

Authors: Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2

Abstract

Retrieval Augmented Generation (RAG) is seeing rapid adoption in industry to enable employees to query information captured in proprietary data for their organisation. In this work, we test the impact of domain-relevant knowledge graph schemas on the results of Microsoft’s GraphRAG pipeline. Our approach aims to address the poor quality of GraphRAG responses on technical reports rich in domain-specific terms. The use case involves technical reports about geology, chemistry and mineral processing published by the Minerals Research Institute of Western Australia (MRIWA). Four schemas are considered: a simple five-class minerals domain expert-developed schema, an expanded minerals domain schema, the Microsoft GraphRAG auto-generated schema, and a schema-less GraphRAG. These are compared to a conventional baseline RAG. Performance is evaluated using a scoring approach that accounts for the mix of correct, incorrect, additional, and missing content in RAG responses. The results show that the simple five-class minerals domain schema extracts approximately 10% more entities from the MRIWA reports than the other schema options. Additionally, both the five-class and the expanded eight-class minerals domain schemas produce the most factually correct answers and the fewest hallucinations. We attribute this to the minerals-specific schemas extracting more relevant, domain-specific information during the Indexing stage. As a result, the Query stage’s context window includes more high-value content. This contributes to the observed improvement in answer quality compared to the other pipelines. In contrast, pipelines with fewer domain-related entities in the KG retrieve less valuable information, leaving more room for irrelevant content in the context window. Baseline RAG responses were typically shorter, less complete, and contained more hallucinations compared to our GraphRAG pipelines. We provide a complete set of resources at https://github.com/nlp-tlp/GraphRAG-on-Minerals-Domain/tree/main. These resources include links to the MRIWA reports, a set of questions (from simple to challenging) along with domain-expert curated answers, schemas, and evaluations of the pipelines.

Cite as

Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke. GraphRAG on Technical Documents - Impact of Knowledge Graph Schema. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 3:1-3:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{scaffidi_et_al:TGDK.3.2.3,
  author =	{Scaffidi, Henri and Hodkiewicz, Melinda and Woods, Caitlin and Roocke, Nicole},
  title =	{{GraphRAG on Technical Documents - Impact of Knowledge Graph Schema}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:24},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.3},
  URN =		{urn:nbn:de:0030-drops-248131},
  doi =		{10.4230/TGDK.3.2.3},
  annote =	{Keywords: RAG, minerals, local search, global search, entity extraction, competency questions}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.98

Fine-Grained Classification of Detecting Dominating Patterns

Authors: Jonathan Dransfeld, Marvin Künnemann, and Mirza Redzic

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

We consider the following generalization of dominating sets: Let G be a host graph and P be a pattern graph P. A dominating P-pattern in G is a subset S of vertices in G that (1) forms a dominating set in G and (2) induces a subgraph isomorphic to P. The graph theory literature studies the properties of dominating P-patterns for various patterns P, including cliques, matchings, independent sets, cycles and paths. Previous work (Kunnemann, Redzic 2024) obtains algorithms and conditional lower bounds for detecting dominating P-patterns particularly for P being a k-clique, a k-independent set and a k-matching. Their results give conditionally tight lower bounds if k is sufficiently large (where the bound depends the matrix multiplication exponent ω). We ask: Can we obtain a classification of the fine-grained complexity for all patterns P? Indeed, we define a graph parameter ρ(P) such that if ω = 2, then (n^ρ(P) m^{(|V(P)|-ρ(P))/2})^{1±o(1)} is the optimal running time assuming the Orthogonal Vectors Hypothesis, for all patterns P except the triangle K₃. Here, the host graph G has n vertices and m = Θ(n^α) edges, where 1 ≤ α ≤ 2. The parameter ρ(P) is closely related (but sometimes different) to a parameter δ(P) = max_{S ⊆ V(P)} |S|-|N(S)| studied in (Alon 1981) to tightly quantify the maximum number of occurrences of induced subgraphs isomorphic to P. Our results stand in contrast to the lack of a full fine-grained classification of detecting an arbitrary (not necessarily dominating) induced P-pattern.

Cite as

Jonathan Dransfeld, Marvin Künnemann, and Mirza Redzic. Fine-Grained Classification of Detecting Dominating Patterns. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 98:1-98:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{dransfeld_et_al:LIPIcs.ESA.2025.98,
  author =	{Dransfeld, Jonathan and K\"{u}nnemann, Marvin and Redzic, Mirza},
  title =	{{Fine-Grained Classification of Detecting Dominating Patterns}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{98:1--98:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.98},
  URN =		{urn:nbn:de:0030-drops-245679},
  doi =		{10.4230/LIPIcs.ESA.2025.98},
  annote =	{Keywords: fine-grained complexity theory, domination in graphs, subgraph isomorphism, classification theorem, parameterized algorithms}
}

Document

DOI: 10.4230/OASIcs.SpaceCHI.2025.6

Toward an Earth-Independent System for EVA Mission Planning: Integrating Physical Models, Domain Knowledge, and Agentic RAG to Provide Explainable LLM-Based Decision Support

Authors: Kaisheng Li and Richard S. Whittle

Published in: OASIcs, Volume 130, Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)

Abstract

We propose a unified framework for an Earth‑independent AI system that provides explainable, context‑aware decision support for EVA mission planning by integrating six core components: a fine‑tuned EVA domain LLM, a retrieval‑augmented knowledge base, a short-term memory store, physical simulation models, an agentic orchestration layer, and a multimodal user interface. To ground our design, we analyze the current roles and substitution potential of the Mission Control Center - identifying which procedural and analytical functions can be automated onboard while preserving human oversight for experiential and strategic tasks. Building on this framework, we introduce RASAGE (Retrieval & Simulation Augmented Guidance Agent for Exploration), a proof‑of‑concept toolset that combines Microsoft Phi‑4‑mini‑instruct with a FAISS (Facebook AI Similarity Search)‑powered EVA knowledge base and custom A* path planning and hypogravity metabolic models to generate grounded, traceable EVA plans. We outline a staged validation strategy to evaluate improvements in route efficiency, metabolic prediction accuracy, anomaly response effectiveness, and crew trust under realistic communication delays. Our findings demonstrate the feasibility of replicating key Mission Control functions onboard, enhancing crew autonomy, reducing cognitive load, and improving safety for deep‑space exploration missions.

Cite as

Kaisheng Li and Richard S. Whittle. Toward an Earth-Independent System for EVA Mission Planning: Integrating Physical Models, Domain Knowledge, and Agentic RAG to Provide Explainable LLM-Based Decision Support. In Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025). Open Access Series in Informatics (OASIcs), Volume 130, pp. 6:1-6:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{li_et_al:OASIcs.SpaceCHI.2025.6,
  author =	{Li, Kaisheng and Whittle, Richard S.},
  title =	{{Toward an Earth-Independent System for EVA Mission Planning: Integrating Physical Models, Domain Knowledge, and Agentic RAG to Provide Explainable LLM-Based Decision Support}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{6:1--6:17},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.6},
  URN =		{urn:nbn:de:0030-drops-239967},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.6},
  annote =	{Keywords: Human-AI Interaction for Space Exploration, Extravehicular Activities, Cognitive load and Human Performance Issues, Human Systems Exploration, Lunar Exploration, LLM}
}

@InProceedings{li_et_al:OASIcs.SpaceCHI.2025.6,
  author =	{Li, Kaisheng and Whittle, Richard S.},
  title =	{{Toward an Earth-Independent System for EVA Mission Planning: Integrating Physical Models, Domain Knowledge, and Agentic RAG to Provide Explainable LLM-Based Decision Support}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{6:1--6:17},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.6},
  URN =		{urn:nbn:de:0030-drops-239967},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.6},
  annote =	{Keywords: Human-AI Interaction for Space Exploration, Extravehicular Activities, Cognitive load and Human Performance Issues, Human Systems Exploration, Lunar Exploration, LLM}
}

Document

DOI: 10.4230/OASIcs.SpaceCHI.2025.31

Monitoring the Structural Health of Space Habitats Through Immersive Data Art Visualization

Authors: Ze Gao, Yuan Zhuang, Kunqi Wang, and Mengyao Guo

Published in: OASIcs, Volume 130, Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)

Abstract

As humanity advances toward long-term space habitation, traditional SHM systems - reliant on abstract data representations - struggle to support rapid decision-making in extreme environments. This study addresses this critical gap by introducing an engineering-art-human factors framework that transforms SHM through immersive data-art visualization. By integrating sensor networks and machine learning, structural data (stress, vibration, deformation) is converted into intuitive visual languages: dynamic color gradients and biomimetic morphologies leverage perceptual laws (e.g., Weber-Fechner) to amplify critical signals. Multimodal interfaces (AR, haptic feedback) and natural elements mitigate cognitive load and psychological stress in confined habitats. Our contribution lies in redefining SHM as a synergy of precision and intuition, enabling "at-a-glance" assessments while balancing functionality and human-centric design. The urgency of this research stems from the inadequacy of conventional systems in extreme space conditions and the growing demand for astronaut safety and operational efficiency. This framework not only pioneers a sustainable monitoring paradigm for space habitats but also extends to terrestrial high-risk infrastructure, demonstrating the necessity of interdisciplinary innovation in extreme environments.

Cite as

Ze Gao, Yuan Zhuang, Kunqi Wang, and Mengyao Guo. Monitoring the Structural Health of Space Habitats Through Immersive Data Art Visualization. In Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025). Open Access Series in Informatics (OASIcs), Volume 130, pp. 31:1-31:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{gao_et_al:OASIcs.SpaceCHI.2025.31,
  author =	{Gao, Ze and Zhuang, Yuan and Wang, Kunqi and Guo, Mengyao},
  title =	{{Monitoring the Structural Health of Space Habitats Through Immersive Data Art Visualization}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{31:1--31:18},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.31},
  URN =		{urn:nbn:de:0030-drops-240217},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.31},
  annote =	{Keywords: Structural health monitoring, space habitats, immersive visualization, human-centered design, interdisciplinary innovation}
}

@InProceedings{gao_et_al:OASIcs.SpaceCHI.2025.31,
  author =	{Gao, Ze and Zhuang, Yuan and Wang, Kunqi and Guo, Mengyao},
  title =	{{Monitoring the Structural Health of Space Habitats Through Immersive Data Art Visualization}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{31:1--31:18},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.31},
  URN =		{urn:nbn:de:0030-drops-240217},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.31},
  annote =	{Keywords: Structural health monitoring, space habitats, immersive visualization, human-centered design, interdisciplinary innovation}
}

Document

RANDOM

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2025.40

Sublinear Space Graph Algorithms in the Continual Release Model

Authors: Alessandro Epasto, Quanquan C. Liu, Tamalika Mukherjee, and Felix Zhou

Published in: LIPIcs, Volume 353, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)

Abstract

The graph continual release model of differential privacy seeks to produce differentially private solutions to graph problems under a stream of edge updates where new private solutions are released after each update. Thus far, previously known edge-differentially private algorithms for most graph problems including densest subgraph and matchings in the continual release setting only output real-value estimates (not vertex subset solutions) and do not use sublinear space. Instead, they rely on computing exact graph statistics on the input [Hendrik Fichtenberger et al., 2021; Shuang Song et al., 2018]. In this paper, we leverage sparsification to address the above shortcomings for edge-insertion streams. Our edge-differentially private algorithms use sublinear space with respect to the number of edges in the graph while some also achieve sublinear space in the number of vertices in the graph. In addition, for the densest subgraph problem, we also output edge-differentially private vertex subset solutions; no previous graph algorithms in the continual release model output such subsets. We make novel use of assorted sparsification techniques from the non-private streaming and static graph algorithms literature to achieve new results in the sublinear space, continual release setting. This includes algorithms for densest subgraph, maximum matching, as well as the first continual release k-core decomposition algorithm. We also develop a novel sparse level data structure for k-core decomposition that may be of independent interest. To complement our insertion-only algorithms, we conclude with polynomial additive error lower bounds for edge-privacy in the fully dynamic setting, where only logarithmic lower bounds were previously known.

Cite as

Alessandro Epasto, Quanquan C. Liu, Tamalika Mukherjee, and Felix Zhou. Sublinear Space Graph Algorithms in the Continual Release Model. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 353, pp. 40:1-40:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{epasto_et_al:LIPIcs.APPROX/RANDOM.2025.40,
  author =	{Epasto, Alessandro and Liu, Quanquan C. and Mukherjee, Tamalika and Zhou, Felix},
  title =	{{Sublinear Space Graph Algorithms in the Continual Release Model}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)},
  pages =	{40:1--40:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-397-3},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{353},
  editor =	{Ene, Alina and Chattopadhyay, Eshan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2025.40},
  URN =		{urn:nbn:de:0030-drops-244064},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2025.40},
  annote =	{Keywords: Differential Privacy, Continual Release, Densest Subgraph, k-Core Decomposition, Maximum Matching}
}

@InProceedings{epasto_et_al:LIPIcs.APPROX/RANDOM.2025.40,
  author =	{Epasto, Alessandro and Liu, Quanquan C. and Mukherjee, Tamalika and Zhou, Felix},
  title =	{{Sublinear Space Graph Algorithms in the Continual Release Model}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)},
  pages =	{40:1--40:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-397-3},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{353},
  editor =	{Ene, Alina and Chattopadhyay, Eshan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2025.40},
  URN =		{urn:nbn:de:0030-drops-244064},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2025.40},
  annote =	{Keywords: Differential Privacy, Continual Release, Densest Subgraph, k-Core Decomposition, Maximum Matching}
}

Document

RANDOM

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2025.29

A Simplified Reduction for Error Correcting Matrix Multiplication Algorithms

Authors: Igor Shinkar and Harsimran Singh

Published in: LIPIcs, Volume 353, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)

Abstract

We study the problem of transforming an algorithm for matrix multiplication, whose output has a small fraction of the entries correct into a matrix multiplication algorithm, whose output is fully correct for all inputs. In this work, we provide a new and simple way to transform an average-case algorithm that takes two matrices A,B ∈ 𝔽_p^{n×n} for a prime p, and outputs a matrix that agrees with the matrix product AB on a 1/p + ε fraction of entries on average for a small ε > 0, into a worst-case algorithm that correctly computes the matrix product for all possible inputs. Our reduction employs list-decodable codes to transform an average-case algorithm into an algorithm with one-sided error, which are known to admit efficient reductions from the work of Gola, Shinkar, and Singh [Gola et al., 2024]. Our reduction is more concise and straightforward compared to the recent work of Hirahara and Shimizu [Hirahara and Shimizu, 2025], and improves the overhead in the running time incurred during the reduction.

Cite as

Igor Shinkar and Harsimran Singh. A Simplified Reduction for Error Correcting Matrix Multiplication Algorithms. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 353, pp. 29:1-29:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{shinkar_et_al:LIPIcs.APPROX/RANDOM.2025.29,
  author =	{Shinkar, Igor and Singh, Harsimran},
  title =	{{A Simplified Reduction for Error Correcting Matrix Multiplication Algorithms}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)},
  pages =	{29:1--29:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-397-3},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{353},
  editor =	{Ene, Alina and Chattopadhyay, Eshan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2025.29},
  URN =		{urn:nbn:de:0030-drops-243953},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2025.29},
  annote =	{Keywords: Matrix Multiplication, Reductions, Worst case to average case reductions}
}

Document

DOI: 10.4230/LIPIcs.WADS.2025.30

Convolution and Knapsack in Higher Dimensions

Authors: Kilian Grage, Klaus Jansen, and Björn Schumacher

Published in: LIPIcs, Volume 349, 19th International Symposium on Algorithms and Data Structures (WADS 2025)

Abstract

In the Knapsack problem, one is given the task of packing a knapsack of a given size with items in order to gain a packing with a high profit value. As one of the most classical problems in computer science, research for this problem has gone a long way. One important connection to the (max,+)-convolution problem has been established, where knapsack solutions can be combined by building the convolution of two sequences. This observation has been used in recent years to give conditional lower bounds but also parameterized algorithms. In this paper we carry these results into higher dimensions. We consider Knapsack where items are characterized by multiple properties - given through a vector - and a knapsack that has a capacity vector. The packing must not exceed any of the given capacity constraints. In order to show a similar sub-quadratic lower bound we consider a multidimensional version of (max, +)-convolution. We then consider variants of this problem introduced by Cygan et al. and prove that they are all equivalent in terms of algorithms that allow for a running time sub-quadratic in the number of entries of the array. We further develop a parameterized algorithm to solve higher dimensional Knapsack. The techniques we apply are inspired by an algorithm introduced by Axiotis and Tzamos. We will show that even for higher dimensional Knapsack, we can reduce the problem to convolution on one-dimensional, concave sequences, leading to an 𝒪(dn + dD ⋅ max{(Π_{i=1}^d t_i), t_max log t_max}) algorithm, where D is the number of different weight vectors, t the capacity vector and d is the dimension of the problem. Then, we use the techniques to improve the approach of Eisenbrand and Weismantel to obtain an algorithm for Integer Linear Programming with upper bounds with running time 𝒪(dn) + D ⋅ 𝒪(d Δ)^{d(d+1)} + T_LP. Finally, we give an divide-and-conquer algorithm for ILP with running time n^{d+1} ⋅ O(Δ)^d ⋅ log(|u - 𝓁|_∞).

Cite as

Kilian Grage, Klaus Jansen, and Björn Schumacher. Convolution and Knapsack in Higher Dimensions. In 19th International Symposium on Algorithms and Data Structures (WADS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 349, pp. 30:1-30:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{grage_et_al:LIPIcs.WADS.2025.30,
  author =	{Grage, Kilian and Jansen, Klaus and Schumacher, Bj\"{o}rn},
  title =	{{Convolution and Knapsack in Higher Dimensions}},
  booktitle =	{19th International Symposium on Algorithms and Data Structures (WADS 2025)},
  pages =	{30:1--30:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-398-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{349},
  editor =	{Morin, Pat and Oh, Eunjin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WADS.2025.30},
  URN =		{urn:nbn:de:0030-drops-242618},
  doi =		{10.4230/LIPIcs.WADS.2025.30},
  annote =	{Keywords: Knapsack, Convolution, Integer Linear Programming}
}

Document

DOI: 10.4230/LIPIcs.GIScience.2025.1

Leveraging Open-Source Satellite-Derived Building Footprints for Height Inference

Authors: Clinton Stipek, Taylor Hauser, Justin Epting, Jessica Moehl, and Daniel Adams

Published in: LIPIcs, Volume 346, 13th International Conference on Geographic Information Science (GIScience 2025)

Abstract

At a global scale, cities are growing and characterizing the built environment is essential for deeper understanding of human population patterns, urban development, energy usage, climate change impacts, among others. Buildings are a key component of the built environment and significant progress has been made in recent years to scale building footprint extractions from satellite datum and other remotely sensed products. Billions of building footprints have recently been released by companies such as Microsoft and Google at a global scale. However, research has shown that depending on the methods leveraged to produce a footprint dataset, discrepancies can arise in both the number and shape of footprints produced. Therefore, each footprint dataset should be examined and used on a case-by-case study. In this work, we find through two experiments on Oak Ridge National Laboratory and Microsoft footprints within the same geographic extent that our approach of inferring height from footprint morphology features is source agnostic. Regardless of the differences associated with the methods used to produce a building footprint dataset, our approach of inferring height was able to overcome these discrepancies between the products and generalize, as evidenced by 98% of our results being within 3m of the ground-truthed height. This signifies that our approach can be applied to the billions of open-source footprints which are freely available to infer height, a key building metric. This work impacts the broader domain of urban science in which building height is a key, and limiting factor.

Cite as

Clinton Stipek, Taylor Hauser, Justin Epting, Jessica Moehl, and Daniel Adams. Leveraging Open-Source Satellite-Derived Building Footprints for Height Inference. In 13th International Conference on Geographic Information Science (GIScience 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 346, pp. 1:1-1:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{stipek_et_al:LIPIcs.GIScience.2025.1,
  author =	{Stipek, Clinton and Hauser, Taylor and Epting, Justin and Moehl, Jessica and Adams, Daniel},
  title =	{{Leveraging Open-Source Satellite-Derived Building Footprints for Height Inference}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{1:1--1:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.1},
  URN =		{urn:nbn:de:0030-drops-238306},
  doi =		{10.4230/LIPIcs.GIScience.2025.1},
  annote =	{Keywords: Building Height, Big Data, Machine Learning}
}

Document

DOI: 10.4230/LIPIcs.GIScience.2025.12

Large Multi-Modal Model Cartographic Map Comprehension for Textual Locality Georeferencing

Authors: Kalana Wijegunarathna, Kristin Stock, and Christopher B. Jones

Published in: LIPIcs, Volume 346, 13th International Conference on Geographic Information Science (GIScience 2025)

Abstract

Millions of biological sample records collected in the last few centuries archived in natural history collections are un-georeferenced. Georeferencing complex locality descriptions associated with these collection samples is a highly labour-intensive task collection agencies struggle with. None of the existing automated methods exploit maps that are an essential tool for georeferencing complex relations. We present preliminary experiments and results of a novel method that exploits multi-modal capabilities of recent Large Multi-Modal Models (LMM). This method enables the model to visually contextualize spatial relations it reads in the locality description. We use a grid-based approach to adapt these auto-regressive models for this task in a zero-shot setting. Our experiments conducted on a small manually annotated dataset show impressive results for our approach (∼1 km Average distance error) compared to uni-modal georeferencing with Large Language Models and existing georeferencing tools. The paper also discusses the findings of the experiments in light of an LMM’s ability to comprehend fine-grained maps. Motivated by these results, a practical framework is proposed to integrate this method into a georeferencing workflow.

Cite as

Kalana Wijegunarathna, Kristin Stock, and Christopher B. Jones. Large Multi-Modal Model Cartographic Map Comprehension for Textual Locality Georeferencing. In 13th International Conference on Geographic Information Science (GIScience 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 346, pp. 12:1-12:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{wijegunarathna_et_al:LIPIcs.GIScience.2025.12,
  author =	{Wijegunarathna, Kalana and Stock, Kristin and Jones, Christopher B.},
  title =	{{Large Multi-Modal Model Cartographic Map Comprehension for Textual Locality Georeferencing}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{12:1--12:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.12},
  URN =		{urn:nbn:de:0030-drops-238412},
  doi =		{10.4230/LIPIcs.GIScience.2025.12},
  annote =	{Keywords: Large Multi-Modal Models, Large Language Models, LLM, Georeferencing, Natural History collections}
}

@InProceedings{wijegunarathna_et_al:LIPIcs.GIScience.2025.12,
  author =	{Wijegunarathna, Kalana and Stock, Kristin and Jones, Christopher B.},
  title =	{{Large Multi-Modal Model Cartographic Map Comprehension for Textual Locality Georeferencing}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{12:1--12:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.12},
  URN =		{urn:nbn:de:0030-drops-238412},
  doi =		{10.4230/LIPIcs.GIScience.2025.12},
  annote =	{Keywords: Large Multi-Modal Models, Large Language Models, LLM, Georeferencing, Natural History collections}
}

Document

DOI: 10.4230/LIPIcs.WABI.2025.10

DiVerG: Scalable Distance Index for Validation of Paired-End Alignments in Sequence Graphs

Authors: Ali Ghaffaari, Alexander Schönhuth, and Tobias Marschall

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)

Abstract

Determining the distance between two loci within a genomic region is a recurrent operation in various tasks in computational genomics. A notable example of this task arises in paired-end read mapping as a form of validation of distances between multiple alignments. While straightforward for a single genome, graph-based reference structures render the operation considerably more involved. Given the sheer number of such queries in a typical read mapping experiment, an efficient algorithm for answering distance queries is crucial. In this paper, we introduce DiVerG, a compact data structure as well as a fast and scalable algorithm, for constructing distance indexes for general sequence graphs on multi-core CPU and many-core GPU architectures. DiVerG is based on PairG [Jain et al., 2019], but overcomes the limitations of PairG by exploiting the extensive potential for improvements in terms of scalability and space efficiency. As a consequence, DiVerG can process substantially larger datasets, such as whole human genomes, which are unmanageable by PairG. DiVerG offers faster index construction time and consistently faster query time with gains proportional to the size of the underlying compact data structure. We demonstrate that our method performs favorably on multiple real datasets at various scales. DiVerG achieves superior performance over PairG; e.g. resulting to 2.5-4x speed-up in query time, 44-340x smaller index size, and 3-50x faster construction time for the genome graph of the MHC region, as a particularly variable region of the human genome. The implementation is available at: https://github.com/cartoonist/diverg

Cite as

Ali Ghaffaari, Alexander Schönhuth, and Tobias Marschall. DiVerG: Scalable Distance Index for Validation of Paired-End Alignments in Sequence Graphs. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 10:1-10:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{ghaffaari_et_al:LIPIcs.WABI.2025.10,
  author =	{Ghaffaari, Ali and Sch\"{o}nhuth, Alexander and Marschall, Tobias},
  title =	{{DiVerG: Scalable Distance Index for Validation of Paired-End Alignments in Sequence Graphs}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{10:1--10:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.10},
  URN =		{urn:nbn:de:0030-drops-239369},
  doi =		{10.4230/LIPIcs.WABI.2025.10},
  annote =	{Keywords: Sequence graph, distance index, read mapping, sparse matrix}
}

Document

DOI: 10.4230/LIPIcs.WABI.2025.14

Human Readable Compression of GFA Paths Using Grammar-Based Code

Authors: Peter Heringer and Daniel Doerr

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)

Abstract

Pangenome graphs offer a compact and comprehensive representation of genomic diversity, improving tasks such as variant calling, genotyping, and other downstream analyses. Although the underlying graph structures scale sublinearly with the number of haplotypes, the widely used GFA file format suffers from rapidly growing file sizes due to the explicit and repetitive encoding of haplotype paths. In this work, we introduce an extension to the GFA format that enables efficient grammar-based compression of haplotype paths while retaining human readability. In addition, grammar-based encoding provides an efficient in-memory data structure that does not require decompression, but conversely improves the runtime of many computational tasks that involve haplotype comparisons. We present sqz, a method that makes use of the proposed format extension to encode haplotype paths using byte pair encoding, a grammar-based compression scheme. We evaluate sqz on recent human pangenome graphs from Heumos et al. and the Human Pangenome Reference Consortium (HPRC), comparing it to existing compressors bgzip, gbz, and sequitur. sqz scales sublinearly with the number of haplotypes in a pangenome graph and consistently achieves higher compression ratios than sequitur and up to 5 times better compression than bgzip in HPRC graphs and up to 10 times in the graph from Heumos et al.. When combined with bgzip, sqz matches or excels the compression ratio of gbz across all our datasets. These results demonstrate the potential of our proposed extension of the GFA format in reducing haplotype path redundancy and improving storage efficiency for pangenome graphs.

Cite as

Peter Heringer and Daniel Doerr. Human Readable Compression of GFA Paths Using Grammar-Based Code. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 14:1-14:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{heringer_et_al:LIPIcs.WABI.2025.14,
  author =	{Heringer, Peter and Doerr, Daniel},
  title =	{{Human Readable Compression of GFA Paths Using Grammar-Based Code}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{14:1--14:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.14},
  URN =		{urn:nbn:de:0030-drops-239395},
  doi =		{10.4230/LIPIcs.WABI.2025.14},
  annote =	{Keywords: pangenomics, pangenome graphs, compression, grammar-based code, byte pair encoding}
}

Document

Research

DOI: 10.4230/OASIcs.Grossi.1

An Efficient Heuristic for Graph Edit Distance

Authors: Xiaoyang Chen, Yujia Wang, Hongwei Huo, and Jeffrey Scott Vitter

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)

Abstract

The graph edit distance (GED) is a flexible distance measure widely used in many applications. Existing GED computation methods are usually based upon the tree-based search algorithm that explores all possible vertex (or edge) mappings between two compared graphs. During this process, various GED lower bounds are adopted as heuristic estimations to accelerate the tree-based search algorithm. For the first time, we analyze the relationship among three state-of-the-art GED lower bounds, label edit distance (LED), Hausdorff edit distance (HED), and branch edit distance (BED). Specifically, we demonstrate that BED(G, Q) ≥ HED(G, Q) and BED(G, Q) ≥ LED(G, Q) for any two undirected graphs G and Q. Furthermore, for BED we propose an efficient heuristic BED^+ for improving the tree-based search algorithm. Extensive experiments on real and synthetic datasets confirm that BED^+ achieves smaller deviation and larger solvable ratios than LED, HED and BED when they are employed as heuristic estimations. The source code is available online.

Cite as

Xiaoyang Chen, Yujia Wang, Hongwei Huo, and Jeffrey Scott Vitter. An Efficient Heuristic for Graph Edit Distance. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 1:1-1:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:OASIcs.Grossi.1,
  author =	{Chen, Xiaoyang and Wang, Yujia and Huo, Hongwei and Vitter, Jeffrey Scott},
  title =	{{An Efficient Heuristic for Graph Edit Distance}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{1:1--1:18},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.1},
  URN =		{urn:nbn:de:0030-drops-238004},
  doi =		{10.4230/OASIcs.Grossi.1},
  annote =	{Keywords: Graph edit distance, Label edit distance, Hausdorff edit distance, Branch edit distance, Tree-based search, Heuristics}
}

Document

DOI: 10.4230/LIPIcs.CP.2025.34

DynamicSAT: Dynamic Configuration Tuning for SAT Solving

Authors: Zhengyuan Shi, Wentao Jiang, Xindi Zhang, Jin Luo, Yun Liang, Zhufei Chu, and Qiang Xu

Published in: LIPIcs, Volume 340, 31st International Conference on Principles and Practice of Constraint Programming (CP 2025)

Abstract

Boolean Satisfiability (SAT) problem serves as a foundation for solving numerous real-world challenges. As problem complexity increases, so does the demand for sophisticated SAT solvers, which incorporate a variety of heuristics tailored to optimize performance for specific problem instances. However, a major limitation persists: a configuration that performs well on one instance may lead to inefficiencies on others. While previous approaches to automatic algorithm configuration set parameters prior to runtime, they fail to adapt to the dynamic evolution of problem characteristics during the solving process. We introduce DynamicSAT, a novel SAT solver framework that dynamically tunes configuration parameters during solving process. By adjusting parameters on-the-fly, DynamicSAT adapts to changes arising from clause learning, elimination, and other transformations, thus improving efficiency and robustness across diverse SAT instances. We demonstrate that DynamicSAT achieves significant performance gains over the state-of-the-art solver on 2024 SAT Competition Benchmark.

Cite as

Zhengyuan Shi, Wentao Jiang, Xindi Zhang, Jin Luo, Yun Liang, Zhufei Chu, and Qiang Xu. DynamicSAT: Dynamic Configuration Tuning for SAT Solving. In 31st International Conference on Principles and Practice of Constraint Programming (CP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 340, pp. 34:1-34:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{shi_et_al:LIPIcs.CP.2025.34,
  author =	{Shi, Zhengyuan and Jiang, Wentao and Zhang, Xindi and Luo, Jin and Liang, Yun and Chu, Zhufei and Xu, Qiang},
  title =	{{DynamicSAT: Dynamic Configuration Tuning for SAT Solving}},
  booktitle =	{31st International Conference on Principles and Practice of Constraint Programming (CP 2025)},
  pages =	{34:1--34:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-380-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{340},
  editor =	{de la Banda, Maria Garcia},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CP.2025.34},
  URN =		{urn:nbn:de:0030-drops-238952},
  doi =		{10.4230/LIPIcs.CP.2025.34},
  annote =	{Keywords: Boolean satisfiability problem, configuration tuning, multi-armed bandit}
}

Document

DOI: 10.4230/OASIcs.Manzini.4

Wavelet Tree, Part II: Text Indexing

Authors: Paolo Ferragina, Raffaele Giancarlo, Roberto Grossi, Giovanna Rosone, Rossano Venturini, and Jeffrey Scott Vitter

Published in: OASIcs, Volume 131, The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday (2025)

Abstract

The Wavelet Tree data structure introduced in Grossi, Gupta, and Vitter [Grossi et al., 2003] is a space-efficient technique for rank and select queries that generalizes from binary symbols to an arbitrary multisymbol alphabet. Over the last two decades, it has become a pivotal tool in modern full-text indexing and data compression because of its properties and capabilities in compressing and indexing data, with many applications to information retrieval, genome analysis, data mining, and web search. In this paper, we survey the fascinating history and impact of Wavelet Trees; no doubt many more developments are yet to come. Our survey borrows some content from the authors' earlier works. This paper is divided into two parts: The first part gives a brief history of Wavelet Trees, including its varieties and practical implementations, which appears in the Festschrift dedicated to Roberto Grossi; the second part (this one) deals with Wavelet Tree-based text indexing and is included in the Festschrift dedicated to Giovanni Manzini.

Cite as

Paolo Ferragina, Raffaele Giancarlo, Roberto Grossi, Giovanna Rosone, Rossano Venturini, and Jeffrey Scott Vitter. Wavelet Tree, Part II: Text Indexing. In The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 131, pp. 4:1-4:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{ferragina_et_al:OASIcs.Manzini.4,
  author =	{Ferragina, Paolo and Giancarlo, Raffaele and Grossi, Roberto and Rosone, Giovanna and Venturini, Rossano and Vitter, Jeffrey Scott},
  title =	{{Wavelet Tree, Part II: Text Indexing}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{4:1--4:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.4},
  URN =		{urn:nbn:de:0030-drops-239127},
  doi =		{10.4230/OASIcs.Manzini.4},
  annote =	{Keywords: Wavelet tree, data compression, text indexing, compressed suffix array, Burrows-Wheeler transform, rank and select}
}

51 Search Results for "Xu, Pan"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message