DROPS

Document

DX Competition

DOI: 10.4230/OASIcs.DX.2025.15

Data-Driven Fault Detection and Isolation Enhanced with System Structural Relationships (DX Competition)

Authors: Austin Coursey, Abel Diaz-Gonzalez, Marcos Quinones-Grueiro, and Gautam Biswas

Published in: OASIcs, Volume 136, 36th International Conference on Principles of Diagnosis and Resilient Systems (DX 2025)

Abstract

Fault detection and isolation are becoming increasingly important as modern systems become more complex. To encourage the development of new fault detection solutions that can operate with limited noisy data and an incomplete mathematical model, the DX 2025 LiU-ICE competition for diagnosis of the air path of an internal combustion engine was introduced. In this paper, we present our winning solution to this competition. Our fault detection architecture starts with a semi-supervised Transformer Autoencoder trained to reconstruct nominal data. Detected faults are then passed through a rule-based fault persistence filter that aims to suppress false positives. Once a fault is detected, we use four neural networks trained to estimate features determined from structural analysis of a partial system model. The residuals of these networks are fed to a supervised fault classification network that estimates the fault probabilities. With this architecture, we achieved an 87% detection rate with a 0% false alarm rate on the provided competition data. Additionally, our isolation architecture assigned the correct fault 73.8% probabilty on average. On unseen competition data from a new driving cycle, we achieved a 100% detection rate and assigned the correct fault 66.2% probability on average. On the other hand, the Transformer Autoencoder failed to transfer to the new driving conditions, causing many false alarms. We discuss ways future work can reduce this.

Cite as

Austin Coursey, Abel Diaz-Gonzalez, Marcos Quinones-Grueiro, and Gautam Biswas. Data-Driven Fault Detection and Isolation Enhanced with System Structural Relationships (DX Competition). In 36th International Conference on Principles of Diagnosis and Resilient Systems (DX 2025). Open Access Series in Informatics (OASIcs), Volume 136, pp. 15:1-15:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{coursey_et_al:OASIcs.DX.2025.15,
  author =	{Coursey, Austin and Diaz-Gonzalez, Abel and Quinones-Grueiro, Marcos and Biswas, Gautam},
  title =	{{Data-Driven Fault Detection and Isolation Enhanced with System Structural Relationships}},
  booktitle =	{36th International Conference on Principles of Diagnosis and Resilient Systems (DX 2025)},
  pages =	{15:1--15:17},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-394-2},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{136},
  editor =	{Quinones-Grueiro, Marcos and Biswas, Gautam and Pill, Ingo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.DX.2025.15},
  URN =		{urn:nbn:de:0030-drops-248043},
  doi =		{10.4230/OASIcs.DX.2025.15},
  annote =	{Keywords: fault detection, fault isolation, autoencoder}
}

Document

Survey

DOI: 10.4230/TGDK.3.2.1

Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2

Abstract

In recent years, knowledge graphs have gained interest and witnessed widespread applications in various domains, such as information retrieval, question-answering, recommendation systems, amongst others. Large-scale knowledge graphs to this end have demonstrated their utility in effectively representing structured knowledge. To further facilitate the application of machine learning techniques, knowledge graph embedding models have been developed. Such models can transform entities and relationships within knowledge graphs into vectors. However, these embedding models often face challenges related to noise, missing information, distribution shift, adversarial attacks, etc. This can lead to sub-optimal embeddings and incorrect inferences, thereby negatively impacting downstream applications. While the existing literature has focused so far on adversarial attacks on KGE models, the challenges related to the other critical aspects remain unexplored. In this paper, we, first of all, give a unified definition of resilience, encompassing several factors such as generalisation, in-distribution generalization, distribution adaption, and robustness. After formalizing these concepts for machine learning in general, we define them in the context of knowledge graphs. To find the gap in the existing works on resilience in the context of knowledge graphs, we perform a systematic survey, taking into account all these aspects mentioned previously. Our survey results show that most of the existing works focus on a specific aspect of resilience, namely robustness. After categorizing such works based on their respective aspects of resilience, we discuss the challenges and future research directions.

Cite as

Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo. Resilience in Knowledge Graph Embeddings. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 1:1-1:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{sharma_et_al:TGDK.3.2.1,
  author =	{Sharma, Arnab and Kouagou, N'Dah Jean and Ngomo, Axel-Cyrille Ngonga},
  title =	{{Resilience in Knowledge Graph Embeddings}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:38},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.1},
  URN =		{urn:nbn:de:0030-drops-248117},
  doi =		{10.4230/TGDK.3.2.1},
  annote =	{Keywords: Knowledge graphs, Resilience, Robustness}
}

Document

DOI: 10.4230/LIPIcs.AFT.2025.27

Unravelling the Probabilistic Forest: Arbitrage in Prediction Markets

Authors: Oriol Saguillo, Vahid Ghafouri, Lucianna Kiffer, and Guillermo Suarez-Tangil

Published in: LIPIcs, Volume 354, 7th Conference on Advances in Financial Technologies (AFT 2025)

Abstract

Polymarket is a prediction market platform where users can speculate on future events by trading shares tied to specific outcomes, known as conditions. Each market on Polymarket is associated with a set of one or more such conditions. To ensure proper market resolution, the condition set must be exhaustive - collectively accounting for all possible outcomes - and mutually exclusive - only one condition may resolve as true. Thus, the collective prices (probabilities) of all related outcomes (whether in a condition or market) should be $1, representing a combined probability of 1 of any outcome. Despite this design, Polymarket exhibits cases where dependent assets are mispriced, allowing for purchasing (or selling) a certain outcome for less than (or more than) $1, guaranteeing profit. This phenomenon, known as arbitrage, could enable sophisticated participants to exploit such inconsistencies. In this paper, we conduct an empirical arbitrage analysis on Polymarket data to answer three key questions: (Q1) What conditions give rise to arbitrage? (Q2) Does arbitrage actually occur on Polymarket?, and (Q3) Has anyone exploited these opportunities? A major challenge in analyzing arbitrage between related markets lies in the scalability of comparisons across a large number of markets and conditions, with a naive analysis requiring O(2^{n+m}) comparisons. To overcome this, we employ a heuristic-driven reduction strategy based on timeliness, topical similarity, and combinatorial relationships, further validated by expert input. Our study reveals two distinct forms of arbitrage on Polymarket: Market Rebalancing Arbitrage, which occurs within a single market or condition (intra-market), and Combinatorial Arbitrage, which spans across multiple markets (inter-market). We use on-chain historical order book data to analyze when these types of arbitrage opportunities have existed, and when they have been executed by users. We find a realized estimate of 40 million USD of profit extracted across both types of arbitrage during our measurement period.

Cite as

Oriol Saguillo, Vahid Ghafouri, Lucianna Kiffer, and Guillermo Suarez-Tangil. Unravelling the Probabilistic Forest: Arbitrage in Prediction Markets. In 7th Conference on Advances in Financial Technologies (AFT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 354, pp. 27:1-27:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{saguillo_et_al:LIPIcs.AFT.2025.27,
  author =	{Saguillo, Oriol and Ghafouri, Vahid and Kiffer, Lucianna and Suarez-Tangil, Guillermo},
  title =	{{Unravelling the Probabilistic Forest: Arbitrage in Prediction Markets}},
  booktitle =	{7th Conference on Advances in Financial Technologies (AFT 2025)},
  pages =	{27:1--27:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-400-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{354},
  editor =	{Avarikioti, Zeta and Christin, Nicolas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.AFT.2025.27},
  URN =		{urn:nbn:de:0030-drops-247468},
  doi =		{10.4230/LIPIcs.AFT.2025.27},
  annote =	{Keywords: Prediction Markets, Maximal Extractable Value, Large Language Models}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ESA.2025.2

Securing Dynamic Data: A Primer on Differentially Private Data Structures (Invited Talk)

Authors: Monika Henzinger and Roodabeh Safavi

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

We give an introduction into differential privacy in the dynamic setting, called the continual observation setting.

Cite as

Monika Henzinger and Roodabeh Safavi. Securing Dynamic Data: A Primer on Differentially Private Data Structures (Invited Talk). In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 2:1-2:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{henzinger_et_al:LIPIcs.ESA.2025.2,
  author =	{Henzinger, Monika and Safavi, Roodabeh},
  title =	{{Securing Dynamic Data: A Primer on Differentially Private Data Structures}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{2:1--2:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.2},
  URN =		{urn:nbn:de:0030-drops-244702},
  doi =		{10.4230/LIPIcs.ESA.2025.2},
  annote =	{Keywords: Differential privacy, continual observation}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.69

Bandwidth vs BFS Width in Matrix Reordering, Graph Reconstruction, and Graph Drawing

Authors: David Eppstein, Michael T. Goodrich, and Songyu (Alfred) Liu

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

We provide the first approximation quality guarantees for the Cuthull-McKee heuristic for reordering symmetric matrices to have low bandwidth, and we provide an algorithm for reconstructing bounded-bandwidth graphs from distance oracles with near-linear query complexity. To prove these results we introduce a new width parameter, BFS width, and we prove polylogarithmic upper and lower bounds on the BFS width of graphs of bounded bandwidth. Unlike other width parameters, such as bandwidth, pathwidth, and treewidth, BFS width can easily be computed in polynomial time. Bounded BFS width implies bounded bandwidth, pathwidth, and treewidth, which in turn imply fixed-parameter tractable algorithms for many problems that are NP-hard for general graphs. In addition to their applications to matrix ordering, we also provide applications of BFS width to graph reconstruction, to reconstruct graphs from distance queries, and graph drawing, to construct arc diagrams of small height.

Cite as

David Eppstein, Michael T. Goodrich, and Songyu (Alfred) Liu. Bandwidth vs BFS Width in Matrix Reordering, Graph Reconstruction, and Graph Drawing. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 69:1-69:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{eppstein_et_al:LIPIcs.ESA.2025.69,
  author =	{Eppstein, David and Goodrich, Michael T. and Liu, Songyu (Alfred)},
  title =	{{Bandwidth vs BFS Width in Matrix Reordering, Graph Reconstruction, and Graph Drawing}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{69:1--69:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.69},
  URN =		{urn:nbn:de:0030-drops-245373},
  doi =		{10.4230/LIPIcs.ESA.2025.69},
  annote =	{Keywords: Graph algorithms, graph theory, graph width, bandwidth, treewidth}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.111

Hardness of Median and Center in the Ulam Metric

Authors: Nick Fischer, Elazar Goldenberg, Mursalin Habib, and Karthik C. S.

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

The classical rank aggregation problem seeks to combine a set X of n permutations into a single representative "consensus" permutation. In this paper, we investigate two fundamental rank aggregation tasks under the well-studied Ulam metric: computing a median permutation (which minimizes the sum of Ulam distances to X) and computing a center permutation (which minimizes the maximum Ulam distance to X) in two settings. - Continuous Setting: In the continuous setting, the median/center is allowed to be any permutation. It is known that computing a center in the Ulam metric is NP-hard and we add to this by showing that computing a median is NP-hard as well via a simple reduction from the Max-Cut problem. While this result may not be unexpected, it had remained elusive until now and confirms a speculation by Chakraborty, Das, and Krauthgamer [SODA '21]. - Discrete Setting: In the discrete setting, the median/center must be a permutation from the input set. We fully resolve the fine-grained complexity of the discrete median and discrete center problems under the Ulam metric, proving that the naive Õ(n² L)-time algorithm (where L is the length of the permutation) is conditionally optimal. This resolves an open problem raised by Abboud, Bateni, Cohen-Addad, Karthik C. S., and Seddighin [APPROX '23]. Our reductions are inspired by the known fine-grained lower bounds for similarity measures, but we face and overcome several new highly technical challenges.

Cite as

Nick Fischer, Elazar Goldenberg, Mursalin Habib, and Karthik C. S.. Hardness of Median and Center in the Ulam Metric. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 111:1-111:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{fischer_et_al:LIPIcs.ESA.2025.111,
  author =	{Fischer, Nick and Goldenberg, Elazar and Habib, Mursalin and Karthik C. S.},
  title =	{{Hardness of Median and Center in the Ulam Metric}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{111:1--111:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.111},
  URN =		{urn:nbn:de:0030-drops-245809},
  doi =		{10.4230/LIPIcs.ESA.2025.111},
  annote =	{Keywords: Ulam distance, median, center, rank aggregation, fine-grained complexity}
}

Document

RANDOM

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2025.30

Algorithmic Contiguity from Low-Degree Conjecture and Applications in Correlated Random Graphs

Authors: Zhangsong Li

Published in: LIPIcs, Volume 353, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)

Abstract

In this paper, assuming a natural strengthening of the low-degree conjecture, we provide evidence of computational hardness for two problems: (1) the (partial) matching recovery problem in the sparse correlated Erdős-Rényi graphs G(n,q;ρ) when the edge-density q = n^{-1+o(1)} and the correlation ρ < √{α} lies below the Otter’s threshold, this resolves a remaining problem in [Jian Ding et al., 2023]; (2) the detection problem between a pair of correlated sparse stochastic block model S(n,λ/n;k,ε;s) and a pair of independent stochastic block models S(n,λs/n;k,ε) when ε² λ s < 1 lies below the Kesten-Stigum (KS) threshold and s < √α lies below the Otter’s threshold, this resolves a remaining problem in [Guanyi Chen et al., 2024]. One of the main ingredient in our proof is to derive certain forms of algorithmic contiguity between two probability measures based on bounds on their low-degree advantage. To be more precise, consider the high-dimensional hypothesis testing problem between two probability measures ℙ and ℚ based on the sample Y. We show that if the low-degree advantage Adv_{≤D}(dℙ/dℚ) = O(1), then (assuming the low-degree conjecture) there is no efficient algorithm A such that ℚ(A(Y) = 0) = 1-o(1) and ℙ(A(Y) = 1) = Ω(1). This framework provides a useful tool for performing reductions between different inference tasks.

Cite as

Zhangsong Li. Algorithmic Contiguity from Low-Degree Conjecture and Applications in Correlated Random Graphs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 353, pp. 30:1-30:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{li:LIPIcs.APPROX/RANDOM.2025.30,
  author =	{Li, Zhangsong},
  title =	{{Algorithmic Contiguity from Low-Degree Conjecture and Applications in Correlated Random Graphs}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)},
  pages =	{30:1--30:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-397-3},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{353},
  editor =	{Ene, Alina and Chattopadhyay, Eshan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2025.30},
  URN =		{urn:nbn:de:0030-drops-243965},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2025.30},
  annote =	{Keywords: Algorithmic Contiguity, Low-degree Conjecture, Correlated Random Graphs}
}

Document

DOI: 10.4230/LIPIcs.MFCS.2025.93

Improved Approximation Algorithms for Capacitated Vehicle Routing with Fixed Capacity

Authors: Jingyang Zhao and Mingyu Xiao

Published in: LIPIcs, Volume 345, 50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025)

Abstract

The Capacitated Vehicle Routing Problem (CVRP) is one of the most extensively studied problems in combinatorial optimization. Based on customer demand, we distinguish three variants of CVRP: unit-demand, splittable, and unsplittable. In this paper, we consider k-CVRP in general metrics and on general graphs, where k is the vehicle capacity. All three versions are APX-hard for any fixed k ≥ 3. Assume that the approximation ratio of metric TSP is 3/2. We present a (5/2 - Θ(√{1/k}))-approximation algorithm for the splittable and unit-demand cases, and a (5/2 + ln 2 - Θ(√{1/k}))-approximation algorithm for the unsplittable case. Our approximation ratio is better than the previous results when k is less than a sufficiently large value, approximately 1.7 x 10⁷. For small values of k, we design independent and elegant algorithms with further improvements. For the splittable and unit-demand cases, we improve the approximation ratio from 1.792 to 1.500 for k = 3, and from 1.750 to 1.500 for k = 4. For the unsplittable case, we improve the approximation ratio from 1.792 to 1.500 for k = 3, from 2.051 to 1.750 for k = 4, and from 2.249 to 2.157 for k = 5. The approximation ratio for k = 3 surprisingly achieves the same value as in the splittable case. Our techniques, such as EX-ITP - an extension of the classic ITP method, have the potential to improve algorithms for other routing problems as well.

Cite as

Jingyang Zhao and Mingyu Xiao. Improved Approximation Algorithms for Capacitated Vehicle Routing with Fixed Capacity. In 50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 345, pp. 93:1-93:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{zhao_et_al:LIPIcs.MFCS.2025.93,
  author =	{Zhao, Jingyang and Xiao, Mingyu},
  title =	{{Improved Approximation Algorithms for Capacitated Vehicle Routing with Fixed Capacity}},
  booktitle =	{50th International Symposium on Mathematical Foundations of Computer Science (MFCS 2025)},
  pages =	{93:1--93:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-388-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{345},
  editor =	{Gawrychowski, Pawe{\l} and Mazowiecki, Filip and Skrzypczak, Micha{\l}},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2025.93},
  URN =		{urn:nbn:de:0030-drops-242008},
  doi =		{10.4230/LIPIcs.MFCS.2025.93},
  annote =	{Keywords: Combinatorial Optimization, Capacitated Vehicle Routing, Approximation Algorithms, Graph Algorithms}
}

Document

DOI: 10.4230/LIPIcs.SEA.2025.2

IBB: Fast Burrows-Wheeler Transform Construction for Length-Diverse DNA Data

Authors: Enno Adler, Stefan Böttcher, Rita Hartel, and Cederic Alexander Steininger

Published in: LIPIcs, Volume 338, 23rd International Symposium on Experimental Algorithms (SEA 2025)

Abstract

The Burrows-Wheeler transform (BWT) is integral to the FM-index, which is used extensively in text compression, indexing, pattern search, and bioinformatic problems as de novo assembly and read alignment. Thus, efficient construction of the BWT in terms of time and memory usage is key to these applications. We present a novel external-memory algorithm called Improved-Bucket Burrows-Wheeler transform (IBB) for constructing the BWT of DNA datasets with highly diverse sequence lengths. IBB uses a right-aligned approach to efficiently handle sequences of varying lengths, a tree-based data structure to manage relative insert positions and ranks, and fine buckets to reduce the necessary amount of input and output to external memory. Our experiments demonstrate that IBB is 10% to 40% faster than the best existing state-of-the-art BWT construction algorithms on most datasets while maintaining competitive memory consumption.

Cite as

Enno Adler, Stefan Böttcher, Rita Hartel, and Cederic Alexander Steininger. IBB: Fast Burrows-Wheeler Transform Construction for Length-Diverse DNA Data. In 23rd International Symposium on Experimental Algorithms (SEA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 338, pp. 2:1-2:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{adler_et_al:LIPIcs.SEA.2025.2,
  author =	{Adler, Enno and B\"{o}ttcher, Stefan and Hartel, Rita and Steininger, Cederic Alexander},
  title =	{{IBB: Fast Burrows-Wheeler Transform Construction for Length-Diverse DNA Data}},
  booktitle =	{23rd International Symposium on Experimental Algorithms (SEA 2025)},
  pages =	{2:1--2:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-375-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{338},
  editor =	{Mutzel, Petra and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2025.2},
  URN =		{urn:nbn:de:0030-drops-232402},
  doi =		{10.4230/LIPIcs.SEA.2025.2},
  annote =	{Keywords: burrows-wheeler transform, self-indexes, external-memory}
}

Document

Survey

DOI: 10.4230/TGDK.3.1.3

Uncertainty Management in the Construction of Knowledge Graphs: A Survey

Authors: Lucas Jarnac, Yoan Chabot, and Miguel Couceiro

Published in: TGDK, Volume 3, Issue 1 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 1

Abstract

Knowledge Graphs (KGs) are a major asset for companies thanks to their great flexibility in data representation and their numerous applications, e.g., vocabulary sharing, Q&A or recommendation systems. To build a KG, it is a common practice to rely on automatic methods for extracting knowledge from various heterogeneous sources. However, in a noisy and uncertain world, knowledge may not be reliable and conflicts between data sources may occur. Integrating unreliable data would directly impact the use of the KG, therefore such conflicts must be resolved. This could be done manually by selecting the best data to integrate. This first approach is highly accurate, but costly and time-consuming. That is why recent efforts focus on automatic approaches, which represent a challenging task since it requires handling the uncertainty of extracted knowledge throughout its integration into the KG. We survey state-of-the-art approaches in this direction and present constructions of both open and enterprise KGs. We then describe different knowledge extraction methods and discuss downstream tasks after knowledge acquisition, including KG completion using embedding models, knowledge alignment, and knowledge fusion in order to address the problem of knowledge uncertainty in KG construction. We conclude with a discussion on the remaining challenges and perspectives when constructing a KG taking into account uncertainty.

Cite as

Lucas Jarnac, Yoan Chabot, and Miguel Couceiro. Uncertainty Management in the Construction of Knowledge Graphs: A Survey. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 3:1-3:48, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{jarnac_et_al:TGDK.3.1.3,
  author =	{Jarnac, Lucas and Chabot, Yoan and Couceiro, Miguel},
  title =	{{Uncertainty Management in the Construction of Knowledge Graphs: A Survey}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:48},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.3},
  URN =		{urn:nbn:de:0030-drops-233733},
  doi =		{10.4230/TGDK.3.1.3},
  annote =	{Keywords: Knowledge reconciliation, Uncertainty, Heterogeneous sources, Knowledge graph construction}
}

Document

DOI: 10.4230/OASIcs.PARMA-DITAM.2025.3

System-Level Timing Performance Estimation Based on a Unifying HW/SW Performance Metric

Authors: Vittoriano Muttillo, Vincenzo Stoico, Giacomo Valente, Marco Santic, Luigi Pomante, and Daniele Frigioni

Published in: OASIcs, Volume 127, 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)

Abstract

The rapidly increasing complexity of embedded systems and the critical impact of non-functional requirements demand the adoption of an appropriate system-level HW/SW co-design methodology. This methodology tries to satisfy all design requirements by simultaneously considering several alternative HW/SW implementations. In this context, early performance estimation approaches are crucial in reducing the design space, thereby minimizing design time and cost. To address the challenge of system-level performance estimation, this work presents and formalizes a novel approach based on a unifying HW/SW performance metric for early execution time estimation. The proposed approach estimates the execution time of a C function when executed by different HW/SW processor technologies. The approach is validated through an extensive experimental study, demonstrating its effectiveness and efficiency in terms of estimation error (i.e., lower than 10%) and estimation time (close to zero) when compared to existing methods in the literature.

Cite as

Vittoriano Muttillo, Vincenzo Stoico, Giacomo Valente, Marco Santic, Luigi Pomante, and Daniele Frigioni. System-Level Timing Performance Estimation Based on a Unifying HW/SW Performance Metric. In 16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025). Open Access Series in Informatics (OASIcs), Volume 127, pp. 3:1-3:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{muttillo_et_al:OASIcs.PARMA-DITAM.2025.3,
  author =	{Muttillo, Vittoriano and Stoico, Vincenzo and Valente, Giacomo and Santic, Marco and Pomante, Luigi and Frigioni, Daniele},
  title =	{{System-Level Timing Performance Estimation Based on a Unifying HW/SW Performance Metric}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{3:1--3:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.3},
  URN =		{urn:nbn:de:0030-drops-229071},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.3},
  annote =	{Keywords: embedded systems, hw/sw co-design, performance estimation, lasso, machine learning}
}

@InProceedings{muttillo_et_al:OASIcs.PARMA-DITAM.2025.3,
  author =	{Muttillo, Vittoriano and Stoico, Vincenzo and Valente, Giacomo and Santic, Marco and Pomante, Luigi and Frigioni, Daniele},
  title =	{{System-Level Timing Performance Estimation Based on a Unifying HW/SW Performance Metric}},
  booktitle =	{16th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 14th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2025)},
  pages =	{3:1--3:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-363-8},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{127},
  editor =	{Cattaneo, Daniele and Fazio, Maria and Kosmidis, Leonidas and Morabito, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.PARMA-DITAM.2025.3},
  URN =		{urn:nbn:de:0030-drops-229071},
  doi =		{10.4230/OASIcs.PARMA-DITAM.2025.3},
  annote =	{Keywords: embedded systems, hw/sw co-design, performance estimation, lasso, machine learning}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2025.66

Graph Reconstruction via MIS Queries

Authors: Christian Konrad, Conor O'Sullivan, and Victor Traistaru

Published in: LIPIcs, Volume 325, 16th Innovations in Theoretical Computer Science Conference (ITCS 2025)

Abstract

In the Graph Reconstruction (GR) problem, a player initially only knows the vertex set V of an input graph G = (V, E) and is required to learn its set of edges E. To this end, the player submits queries to an oracle and must deduce E from the oracle’s answers. Angluin and Chen [Journal of Computer and System Sciences, 2008] resolved the number of Independent Set (IS) queries necessary and sufficient for GR on m-edge graphs. In this setting, each query consists of a subset of vertices U ⊆ V, and the oracle responds with a boolean, indicating whether U is an independent set in G. They gave algorithms that use O(m ⋅ log n) IS queries, which is best possible. In this paper, we initiate the study of GR via Maximal Independent Set (MIS) queries, a more powerful variant of IS queries. Given a query U ⊆ V, the oracle responds with any, potentially adversarially chosen, maximal independent set I ⊆ U in the induced subgraph G[U]. We show that, for GR, MIS queries are strictly more powerful than IS queries when parametrized by the maximum degree Δ of the input graph. We give tight (up to poly-logarithmic factors) upper and lower bounds for this problem: 1) We observe that the simple strategy of taking uniform independent random samples of V and submitting those to the oracle yields a non-adaptive randomized algorithm that executes O(Δ² ⋅ log n) queries and succeeds with high probability. This should be contrasted with the fact that Ω(Δ ⋅ n ⋅ log(n/Δ)) IS queries are required for such graphs, which shows that MIS queries are strictly more powerful than IS queries. Interestingly, combining the strategy of taking uniform random samples of V with the probabilistic method, we show the existence of a deterministic non-adaptive algorithm that executes O(Δ³ ⋅ log(n/Δ)) queries. 2) Regarding lower bounds, we prove that the additional Δ factor when going from randomized non-adaptive algorithms to deterministic non-adaptive algorithms is necessary. We show that every non-adaptive deterministic algorithm requires Ω(Δ³ / log² Δ) queries. For arbitrary randomized adaptive algorithms, we show that Ω(Δ²) queries are necessary in graphs of maximum degree Δ, and that Ω(log n) queries are necessary, even when the input graph is an n-vertex cycle.

Cite as

Christian Konrad, Conor O'Sullivan, and Victor Traistaru. Graph Reconstruction via MIS Queries. In 16th Innovations in Theoretical Computer Science Conference (ITCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 325, pp. 66:1-66:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{konrad_et_al:LIPIcs.ITCS.2025.66,
  author =	{Konrad, Christian and O'Sullivan, Conor and Traistaru, Victor},
  title =	{{Graph Reconstruction via MIS Queries}},
  booktitle =	{16th Innovations in Theoretical Computer Science Conference (ITCS 2025)},
  pages =	{66:1--66:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-361-4},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{325},
  editor =	{Meka, Raghu},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2025.66},
  URN =		{urn:nbn:de:0030-drops-226945},
  doi =		{10.4230/LIPIcs.ITCS.2025.66},
  annote =	{Keywords: Query Complexity, Graph Reconstruction, Maximal Independent Set Queries}
}

Document

DOI: 10.4230/LIPIcs.ESA.2024.91

Euclidean Capacitated Vehicle Routing in the Random Setting: A 1.55-Approximation Algorithm

Authors: Zipei Nie and Hang Zhou

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)

Abstract

We study the unit-demand capacitated vehicle routing problem in the random setting of the Euclidean plane. The objective is to visit n random terminals in a square using a set of tours of minimum total length, such that each tour visits the depot and at most k terminals. We design an algorithm combining the classical sweep heuristic and the framework for the Euclidean traveling salesman problem due to Arora [J. ACM 1998] and Mitchell [SICOMP 1999]. We show that our algorithm is a polynomial-time approximation of ratio at most 1.55 asymptotically almost surely. This improves on the prior ratio of 1.915 due to Mathieu and Zhou [RSA 2022]. In addition, we conjecture that, for any ε > 0, our algorithm is a (1+ε)-approximation asymptotically almost surely.

Cite as

Zipei Nie and Hang Zhou. Euclidean Capacitated Vehicle Routing in the Random Setting: A 1.55-Approximation Algorithm. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 91:1-91:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{nie_et_al:LIPIcs.ESA.2024.91,
  author =	{Nie, Zipei and Zhou, Hang},
  title =	{{Euclidean Capacitated Vehicle Routing in the Random Setting: A 1.55-Approximation Algorithm}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{91:1--91:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.91},
  URN =		{urn:nbn:de:0030-drops-211627},
  doi =		{10.4230/LIPIcs.ESA.2024.91},
  annote =	{Keywords: capacitated vehicle routing, approximation algorithm, combinatorial optimization}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2024.81

Faster Approximation Scheme for Euclidean k-TSP

Authors: Ernest van Wijland and Hang Zhou

Published in: LIPIcs, Volume 293, 40th International Symposium on Computational Geometry (SoCG 2024)

Abstract

In the Euclidean k-traveling salesman problem (k-TSP), we are given n points in the d-dimensional Euclidean space, for some fixed constant d ≥ 2, and a positive integer k. The goal is to find a shortest tour visiting at least k points. We give an approximation scheme for the Euclidean k-TSP in time n⋅2^O(1/ε^{d-1})⋅(log n)^{2d²⋅2^d}. This improves Arora’s approximation scheme of running time n⋅k⋅(log n)^(O(√d/ε))^{d-1}} [J. ACM 1998]. Our algorithm is Gap-ETH tight and can be derandomized by increasing the running time by a factor O(n^d).

Cite as

Ernest van Wijland and Hang Zhou. Faster Approximation Scheme for Euclidean k-TSP. In 40th International Symposium on Computational Geometry (SoCG 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 293, pp. 81:1-81:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{vanwijland_et_al:LIPIcs.SoCG.2024.81,
  author =	{van Wijland, Ernest and Zhou, Hang},
  title =	{{Faster Approximation Scheme for Euclidean k-TSP}},
  booktitle =	{40th International Symposium on Computational Geometry (SoCG 2024)},
  pages =	{81:1--81:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-316-4},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{293},
  editor =	{Mulzer, Wolfgang and Phillips, Jeff M.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2024.81},
  URN =		{urn:nbn:de:0030-drops-200268},
  doi =		{10.4230/LIPIcs.SoCG.2024.81},
  annote =	{Keywords: approximation algorithms, optimization, traveling salesman problem}
}

Document

Vision

DOI: 10.4230/TGDK.1.1.8

Machine Learning and Knowledge Graphs: Existing Gaps and Future Research Challenges

Authors: Claudia d'Amato, Louis Mahon, Pierre Monnin, and Giorgos Stamou

Published in: TGDK, Volume 1, Issue 1 (2023): Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge, Volume 1, Issue 1

Abstract

The graph model is nowadays largely adopted to model a wide range of knowledge and data, spanning from social networks to knowledge graphs (KGs), representing a successful paradigm of how symbolic and transparent AI can scale on the World Wide Web. However, due to their unprecedented volume, they are generally tackled by Machine Learning (ML) and mostly numeric based methods such as graph embedding models (KGE) and deep neural networks (DNNs). The latter methods have been proved lately very efficient, leading the current AI spring. In this vision paper, we introduce some of the main existing methods for combining KGs and ML, divided into two categories: those using ML to improve KGs, and those using KGs to improve results on ML tasks. From this introduction, we highlight research gaps and perspectives that we deem promising and currently under-explored for the involved research communities, spanning from KG support for LLM prompting, integration of KG semantics in ML models to symbol-based methods, interpretability of ML models, and the need for improved benchmark datasets. In our opinion, such perspectives are stepping stones in an ultimate view of KGs as central assets for neuro-symbolic and explainable AI.

Cite as

Claudia d'Amato, Louis Mahon, Pierre Monnin, and Giorgos Stamou. Machine Learning and Knowledge Graphs: Existing Gaps and Future Research Challenges. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 8:1-8:35, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@Article{damato_et_al:TGDK.1.1.8,
  author =	{d'Amato, Claudia and Mahon, Louis and Monnin, Pierre and Stamou, Giorgos},
  title =	{{Machine Learning and Knowledge Graphs: Existing Gaps and Future Research Challenges}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{8:1--8:35},
  year =	{2023},
  volume =	{1},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.1.1.8},
  URN =		{urn:nbn:de:0030-drops-194824},
  doi =		{10.4230/TGDK.1.1.8},
  annote =	{Keywords: Graph-based Learning, Knowledge Graph Embeddings, Large Language Models, Explainable AI, Knowledge Graph Completion \& Curation}
}

24 Search Results for "Zhou, Hang"

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message