23 Search Results for "Yang, Sheng"


Document
Research
Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web

Authors: Florian Ruosch, Cristina Sarasua, and Abraham Bernstein

Published in: TGDK, Volume 3, Issue 3 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 3


Abstract
In Argument Mining, predicting argumentative relations between texts (or spans) remains one of the most challenging aspects, even more so in the cross-document setting. This paper makes three key contributions to advance research in this domain. We first extend an existing dataset, the Sci-Arg corpus, by annotating it with explicit inter-document argumentative relations, thereby allowing arguments to be distributed over several documents forming an Argument Web; these new annotations are published using Semantic Web technologies (RDF, OWL). Second, we explore and evaluate three automated approaches for predicting these inter-document argumentative relations, establishing critical baselines on the new dataset. We find that a simple classifier based on discourse indicators with access to context outperforms neural methods. Third, we conduct a comparative analysis of these approaches for both intra- and inter-document settings, identifying statistically significant differences in results that indicate the necessity of distinguishing between these two scenarios. Our findings highlight significant challenges in this complex domain and open crucial avenues for future research on the Argument Web of Science, particularly for those interested in leveraging Semantic Web technologies and knowledge graphs to understand scholarly discourse. With this, we provide the first stepping stones in the form of a benchmark dataset, three baseline methods, and an initial analysis for a systematic exploration of this field relevant to the Web of Data and Science.

Cite as

Florian Ruosch, Cristina Sarasua, and Abraham Bernstein. Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 3, pp. 4:1-4:33, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@Article{ruosch_et_al:TGDK.3.3.4,
  author =	{Ruosch, Florian and Sarasua, Cristina and Bernstein, Abraham},
  title =	{{Mining Inter-Document Argument Structures in Scientific Papers for an Argument Web}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{4:1--4:33},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{3},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.3.4},
  URN =		{urn:nbn:de:0030-drops-252159},
  doi =		{10.4230/TGDK.3.3.4},
  annote =	{Keywords: Argument Mining, Large Language Models, Knowledge Graphs, Link Prediction}
}
Document
Cutoff for the Swendsen–Wang Dynamics on the Complete Graph

Authors: Antonio Blanca and Zhezheng Song

Published in: LIPIcs, Volume 360, 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)


Abstract
We study the speed of convergence of the Swendsen-Wang (SW) dynamics for the q-state ferromagnetic Potts model on the n-vertex complete graph, known as the mean-field model. The SW dynamics was introduced as an attractive alternative to the local Glauber dynamics, often offering faster convergence rates to stationarity in a variety of settings. A series of works have characterized the asymptotic behavior of the speed of convergence of the mean-field SW dynamics for all q ≥ 2 and all values of the inverse temperature parameter β > 0. In particular, it is known that when β > q the mixing time of the SW dynamics is Θ(log n). We strengthen this result by showing that for all β > q, there exists a constant c(β,q) > 0 such that the mixing time of the SW dynamics is c(β,q) log n + Θ(1). This implies that the mean-field SW dynamics exhibits the cutoff phenomenon in this temperature regime, demonstrating that this Markov chain undergoes a sharp transition from "far from stationarity" to "well-mixed" within a narrow Θ(1) time window. The presence of cutoff is algorithmically significant, as simulating the chain for fewer steps than its mixing time could lead to highly biased samples.

Cite as

Antonio Blanca and Zhezheng Song. Cutoff for the Swendsen–Wang Dynamics on the Complete Graph. In 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 360, pp. 17:1-17:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{blanca_et_al:LIPIcs.FSTTCS.2025.17,
  author =	{Blanca, Antonio and Song, Zhezheng},
  title =	{{Cutoff for the Swendsen–Wang Dynamics on the Complete Graph}},
  booktitle =	{45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)},
  pages =	{17:1--17:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-406-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{360},
  editor =	{Aiswarya, C. and Mehta, Ruta and Roy, Subhajit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2025.17},
  URN =		{urn:nbn:de:0030-drops-250987},
  doi =		{10.4230/LIPIcs.FSTTCS.2025.17},
  annote =	{Keywords: Markov chains, mixing times, cutoff phenomenon, Potts model, mean-field}
}
Document
Survey
Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2


Abstract
In recent years, knowledge graphs have gained interest and witnessed widespread applications in various domains, such as information retrieval, question-answering, recommendation systems, amongst others. Large-scale knowledge graphs to this end have demonstrated their utility in effectively representing structured knowledge. To further facilitate the application of machine learning techniques, knowledge graph embedding models have been developed. Such models can transform entities and relationships within knowledge graphs into vectors. However, these embedding models often face challenges related to noise, missing information, distribution shift, adversarial attacks, etc. This can lead to sub-optimal embeddings and incorrect inferences, thereby negatively impacting downstream applications. While the existing literature has focused so far on adversarial attacks on KGE models, the challenges related to the other critical aspects remain unexplored. In this paper, we, first of all, give a unified definition of resilience, encompassing several factors such as generalisation, in-distribution generalization, distribution adaption, and robustness. After formalizing these concepts for machine learning in general, we define them in the context of knowledge graphs. To find the gap in the existing works on resilience in the context of knowledge graphs, we perform a systematic survey, taking into account all these aspects mentioned previously. Our survey results show that most of the existing works focus on a specific aspect of resilience, namely robustness. After categorizing such works based on their respective aspects of resilience, we discuss the challenges and future research directions.

Cite as

Arnab Sharma, N'Dah Jean Kouagou, and Axel-Cyrille Ngonga Ngomo. Resilience in Knowledge Graph Embeddings. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 1:1-1:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@Article{sharma_et_al:TGDK.3.2.1,
  author =	{Sharma, Arnab and Kouagou, N'Dah Jean and Ngomo, Axel-Cyrille Ngonga},
  title =	{{Resilience in Knowledge Graph Embeddings}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{1:1--1:38},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.1},
  URN =		{urn:nbn:de:0030-drops-248117},
  doi =		{10.4230/TGDK.3.2.1},
  annote =	{Keywords: Knowledge graphs, Resilience, Robustness}
}
Document
Constructing Long Paths in Graph Streams

Authors: Christian Konrad and Chhaya Trehan

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)


Abstract
In the graph stream model of computation, an algorithm processes the edges of an n-vertex input graph in one or more sequential passes while using a memory that is sublinear in the input size. The streaming model poses significant challenges for algorithmically constructing long paths. Many known algorithms that are tasked with extending an existing path as a subroutine require an entire pass over the input to add a single additional edge. This raises a fundamental question: Are multiple passes inherently necessary to construct paths of non-trivial lengths, or can a single pass suffice? To address this question, we systematically study the Longest Path problem in the one-pass streaming model. In this problem, given a desired approximation factor α, the objective is to compute a path of length at least lp(G)/α, where lp(G) is the length of a longest path in the input graph G. We study the problem in the insertion-only and the insertion-deletion streaming models, and we give algorithms as well as space lower bounds for both undirected and directed graphs. Our results are: 1) We show that for undirected graphs, in both the insertion-only and the insertion-deletion models, there are semi-streaming algorithms, i.e., algorithms that use space O(n poly log n), that compute a path of length at least d/3 with high probability, where d is the average degree of the input graph. These algorithms can also yield an α-approximation to Longest Path using space Õ(n²/α). 2) Next, we show that such a result cannot be achieved for directed graphs, even in the insertion-only model. We show that computing a (n^{1-o(1)})-approximation to Longest Path in directed graphs in the insertion-only model requires space Ω(n²). This result is in line with recent results that demonstrate that processing directed graphs is often significantly harder than undirected graphs in the streaming model. 3) We further complement our results with two additional lower bounds. First, we show that semi-streaming space is insufficient for small constant factor approximations to Longest Path for undirected graphs in the insertion-only model. Last, in undirected graphs in the insertion-deletion model, we show that computing an α-approximation requires space Ω(n²/α³).

Cite as

Christian Konrad and Chhaya Trehan. Constructing Long Paths in Graph Streams. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 22:1-22:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{konrad_et_al:LIPIcs.ESA.2025.22,
  author =	{Konrad, Christian and Trehan, Chhaya},
  title =	{{Constructing Long Paths in Graph Streams}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{22:1--22:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.22},
  URN =		{urn:nbn:de:0030-drops-244902},
  doi =		{10.4230/LIPIcs.ESA.2025.22},
  annote =	{Keywords: Longest Path Problem, Streaming Algorithms, One-way Two-party Communication Complexity}
}
Document
Efficiency of Learned Indexes on Genome Spectra

Authors: Md. Hasin Abrar, Paul Medvedev, and Giorgio Vinciguerra

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)


Abstract
Data structures on a multiset of genomic k-mers are at the heart of many bioinformatic tools. As genomic datasets grow in scale, the efficiency of these data structures increasingly depends on how well they leverage the inherent patterns in the data. One recent and effective approach is the use of learned indexes that approximate the rank function of a multiset using a piecewise linear function with very few segments. However, theoretical worst-case analysis struggles to predict the practical performance of these indexes. We address this limitation by developing a novel measure of piecewise-linear approximability of the data, called CaPLa (Canonical Piecewise Linear approximability). CaPLa builds on the empirical observation that a power-law model often serves as a reasonable proxy for piecewise linear-approximability, while explicitly accounting for deviations from a true power-law fit. We prove basic properties of CaPLa and present an efficient algorithm to compute it. We then demonstrate that CaPLa can accurately predict space bounds for data structures on real data. Empirically, we analyze over 500 genomes through the lens of CaPLa, revealing that it varies widely across the tree of life and even within individual genomes. Finally, we study the robustness of CaPLa as a measure and the factors that make genomic k-mer multisets different from random ones.

Cite as

Md. Hasin Abrar, Paul Medvedev, and Giorgio Vinciguerra. Efficiency of Learned Indexes on Genome Spectra. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 18:1-18:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{abrar_et_al:LIPIcs.ESA.2025.18,
  author =	{Abrar, Md. Hasin and Medvedev, Paul and Vinciguerra, Giorgio},
  title =	{{Efficiency of Learned Indexes on Genome Spectra}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{18:1--18:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.18},
  URN =		{urn:nbn:de:0030-drops-244865},
  doi =		{10.4230/LIPIcs.ESA.2025.18},
  annote =	{Keywords: Genome spectra, piecewise linear approximation, learned index, k-mers}
}
Document
Integrating Human-In-The-Loop AI to Tackle Space Communication Delay Challenges

Authors: Nikos Mavrakis, Effie Lai-Chong Law, and Hubert P. H. Shum

Published in: OASIcs, Volume 130, Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)


Abstract
Deep space missions face significant communication delays that disrupt both operational workflows and psychological support for crew members. Unlike low Earth orbit operations, delays ranging from several minutes to nearly an hour make real-time communication with mission control infeasible, forcing crews to act with greater independence under uncertain conditions. This position paper examines how human-in-the-loop AI, digital twins, and edge AI can be integrated to mitigate these delays while maintaining astronaut autonomy and engagement. We argue that human-in-the-loop AI enables decision-making processes that are responsive to local context while remaining adaptable to changing mission demands. Digital twins offer real-time simulation and predictive modelling capabilities, allowing astronauts to explore options and troubleshoot without waiting for ground input. Edge AI brings computation closer to data sources, enabling low-latency inference onboard spacecraft for time-critical decisions. These ideas are explored through two use cases: using deepfakes to support emotionally resonant communication with loved ones, and applying visual-language models for onboard fault diagnosis and adaptive task replanning. We conclude with reflections on system design challenges under constrained and high-stakes conditions.

Cite as

Nikos Mavrakis, Effie Lai-Chong Law, and Hubert P. H. Shum. Integrating Human-In-The-Loop AI to Tackle Space Communication Delay Challenges. In Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025). Open Access Series in Informatics (OASIcs), Volume 130, pp. 15:1-15:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{mavrakis_et_al:OASIcs.SpaceCHI.2025.15,
  author =	{Mavrakis, Nikos and Law, Effie Lai-Chong and Shum, Hubert P. H.},
  title =	{{Integrating Human-In-The-Loop AI to Tackle Space Communication Delay Challenges}},
  booktitle =	{Advancing Human-Computer Interaction for Space Exploration (SpaceCHI 2025)},
  pages =	{15:1--15:16},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-384-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{130},
  editor =	{Bensch, Leonie and Nilsson, Tommy and Nisser, Martin and Pataranutaporn, Pat and Schmidt, Albrecht and Sumini, Valentina},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SpaceCHI.2025.15},
  URN =		{urn:nbn:de:0030-drops-240051},
  doi =		{10.4230/OASIcs.SpaceCHI.2025.15},
  annote =	{Keywords: Human-in-the-loop AI, communication delays, human spaceflight}
}
Document
RANDOM
Algorithmic Contiguity from Low-Degree Conjecture and Applications in Correlated Random Graphs

Authors: Zhangsong Li

Published in: LIPIcs, Volume 353, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)


Abstract
In this paper, assuming a natural strengthening of the low-degree conjecture, we provide evidence of computational hardness for two problems: (1) the (partial) matching recovery problem in the sparse correlated Erdős-Rényi graphs G(n,q;ρ) when the edge-density q = n^{-1+o(1)} and the correlation ρ < √{α} lies below the Otter’s threshold, this resolves a remaining problem in [Jian Ding et al., 2023]; (2) the detection problem between a pair of correlated sparse stochastic block model S(n,λ/n;k,ε;s) and a pair of independent stochastic block models S(n,λs/n;k,ε) when ε² λ s < 1 lies below the Kesten-Stigum (KS) threshold and s < √α lies below the Otter’s threshold, this resolves a remaining problem in [Guanyi Chen et al., 2024]. One of the main ingredient in our proof is to derive certain forms of algorithmic contiguity between two probability measures based on bounds on their low-degree advantage. To be more precise, consider the high-dimensional hypothesis testing problem between two probability measures ℙ and ℚ based on the sample Y. We show that if the low-degree advantage Adv_{≤D}(dℙ/dℚ) = O(1), then (assuming the low-degree conjecture) there is no efficient algorithm A such that ℚ(A(Y) = 0) = 1-o(1) and ℙ(A(Y) = 1) = Ω(1). This framework provides a useful tool for performing reductions between different inference tasks.

Cite as

Zhangsong Li. Algorithmic Contiguity from Low-Degree Conjecture and Applications in Correlated Random Graphs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 353, pp. 30:1-30:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{li:LIPIcs.APPROX/RANDOM.2025.30,
  author =	{Li, Zhangsong},
  title =	{{Algorithmic Contiguity from Low-Degree Conjecture and Applications in Correlated Random Graphs}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2025)},
  pages =	{30:1--30:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-397-3},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{353},
  editor =	{Ene, Alina and Chattopadhyay, Eshan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2025.30},
  URN =		{urn:nbn:de:0030-drops-243965},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2025.30},
  annote =	{Keywords: Algorithmic Contiguity, Low-degree Conjecture, Correlated Random Graphs}
}
Document
Assessing Map Reproducibility with Visual Question-Answering: An Empirical Evaluation

Authors: Eftychia Koukouraki, Auriol Degbelo, and Christian Kray

Published in: LIPIcs, Volume 346, 13th International Conference on Geographic Information Science (GIScience 2025)


Abstract
Reproducibility is a key principle of the modern scientific method. Maps, as an important means of communicating scientific results in GIScience and across disciplines, should be reproducible. Currently, map reproducibility assessment is done manually, which makes the assessment process tedious and time-consuming, ultimately limiting its efficiency. Hence, this work explores the extent to which Visual Question-Answering (VQA) can be used to automate some tasks relevant to map reproducibility assessment. We selected five state-of-the-art vision language models (VLMs) and followed a three-step approach to evaluate their ability to discriminate between maps and other images, interpret map content, and compare two map images using VQA. Our results show that current VLMs already possess map-reading capabilities and demonstrate understanding of spatial concepts, such as cardinal directions, geographic scope, and legend interpretation. Our paper demonstrates the potential of using VQA to support reproducibility assessment and highlights the outstanding issues that need to be addressed to achieve accurate, trustworthy map descriptions, thereby reducing the time and effort required by human evaluators.

Cite as

Eftychia Koukouraki, Auriol Degbelo, and Christian Kray. Assessing Map Reproducibility with Visual Question-Answering: An Empirical Evaluation. In 13th International Conference on Geographic Information Science (GIScience 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 346, pp. 13:1-13:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{koukouraki_et_al:LIPIcs.GIScience.2025.13,
  author =	{Koukouraki, Eftychia and Degbelo, Auriol and Kray, Christian},
  title =	{{Assessing Map Reproducibility with Visual Question-Answering: An Empirical Evaluation}},
  booktitle =	{13th International Conference on Geographic Information Science (GIScience 2025)},
  pages =	{13:1--13:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-378-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{346},
  editor =	{Sila-Nowicka, Katarzyna and Moore, Antoni and O'Sullivan, David and Adams, Benjamin and Gahegan, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2025.13},
  URN =		{urn:nbn:de:0030-drops-238426},
  doi =		{10.4230/LIPIcs.GIScience.2025.13},
  annote =	{Keywords: map comparison, computational reproducibility, visual question answering, large language models, GeoAI}
}
Document
Algorithms for Computing Very Large BWTs: a Short Survey

Authors: Diego Díaz-Domínguez, Lavinia Egidi, Veronica Guerrini, Felipe A. Louza, and Giovanna Rosone

Published in: OASIcs, Volume 131, The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday (2025)


Abstract
The Burrows-Wheeler Transform (BWT) is a fundamental string transformation that, although initially introduced for data compression, has been extensively utilized across various domains, including text indexing and pattern matching within large datasets. Although the BWT construction is linear, the constants make the task impractical for large datasets, and as highlighted by Ferragina et al. [Paolo Ferragina et al., 2012], "to use it, one must first build it!". Thus, the construction of the BWT remains a significant challenge. For these reasons, during the past three decades there has been a succession of new algorithms for its construction using techniques that work in external memory or that use text compression. In this survey, we revise some of the most important advancements and tools presented in the past years for computing large BWTs exploiting external memory or text compression approaches without using additional information about the data.

Cite as

Diego Díaz-Domínguez, Lavinia Egidi, Veronica Guerrini, Felipe A. Louza, and Giovanna Rosone. Algorithms for Computing Very Large BWTs: a Short Survey. In The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 131, pp. 7:1-7:28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{diazdominguez_et_al:OASIcs.Manzini.7,
  author =	{D{\'\i}az-Dom{\'\i}nguez, Diego and Egidi, Lavinia and Guerrini, Veronica and Louza, Felipe A. and Rosone, Giovanna},
  title =	{{Algorithms for Computing Very Large BWTs: a Short Survey}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{7:1--7:28},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.7},
  URN =		{urn:nbn:de:0030-drops-239151},
  doi =		{10.4230/OASIcs.Manzini.7},
  annote =	{Keywords: Burrows-Wheeler transform, Extended Burrows-Wheeler transform, external memory, text compression, longest common prefix}
}
Document
Solving the Agile Earth Observation Satellite Scheduling Problem with CP and Local Search

Authors: Valentin Antuori, Damien T. Wojtowicz, and Emmanuel Hebrard

Published in: LIPIcs, Volume 340, 31st International Conference on Principles and Practice of Constraint Programming (CP 2025)


Abstract
The increasing hunger for remote sensing data fuels a boom in satellite imagery, leading to larger agile Earth observation satellite (AEOS) constellations. Therefore, instances of the AEOS scheduling problem (AEOSSP) has become harder to solve. As most existing approaches to solve AEOSSP are designed for a single spacecraft or smaller constellations in mind, they are not tailored to the need of our industrial partner that is about to launch a constellation of 20 AEOSs. Hence, we designed a local search solver able to schedule observations and downloads at such a scale. It relies on solving a series of sub-problems as travelling salesman problem with time windows (TSPTW), first greedily, then using a CP-SAT exact solver in order to find a solution when the greedy insertion fails. Lastly, it schedules downloads and enforces memory constraints with greedy algorithms. Experiments were carried out on instances from the literature as well as generated instances from a simulator we designed. Our experiments show that using CP to solve the sub-problem significantly improve the solutions, and overall our method is slightly better than state-of-the-art approaches.

Cite as

Valentin Antuori, Damien T. Wojtowicz, and Emmanuel Hebrard. Solving the Agile Earth Observation Satellite Scheduling Problem with CP and Local Search. In 31st International Conference on Principles and Practice of Constraint Programming (CP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 340, pp. 3:1-3:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{antuori_et_al:LIPIcs.CP.2025.3,
  author =	{Antuori, Valentin and Wojtowicz, Damien T. and Hebrard, Emmanuel},
  title =	{{Solving the Agile Earth Observation Satellite Scheduling Problem with CP and Local Search}},
  booktitle =	{31st International Conference on Principles and Practice of Constraint Programming (CP 2025)},
  pages =	{3:1--3:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-380-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{340},
  editor =	{de la Banda, Maria Garcia},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CP.2025.3},
  URN =		{urn:nbn:de:0030-drops-238647},
  doi =		{10.4230/LIPIcs.CP.2025.3},
  annote =	{Keywords: Local Search, Greedy Algorithms, Aerospace Applications}
}
Document
CluStRE: Streaming Graph Clustering with Multi-Stage Refinement

Authors: Adil Chhabra, Shai Dorian Peretz, and Christian Schulz

Published in: LIPIcs, Volume 338, 23rd International Symposium on Experimental Algorithms (SEA 2025)


Abstract
We present CluStRE, a novel streaming graph clustering algorithm that balances computational efficiency with high-quality clustering using multi-stage refinement. Unlike traditional in-memory clustering approaches, CluStRE processes graphs in a streaming setting, significantly reducing memory overhead while leveraging re-streaming and evolutionary heuristics to improve solution quality. Our method dynamically constructs a quotient graph, enabling modularity-based optimization while efficiently handling large-scale graphs. We introduce multiple configurations of CluStRE to provide trade-offs between speed, memory consumption, and clustering quality. Experimental evaluations demonstrate that CluStRE improves solution quality by 89.8%, operates 2.6× faster, and uses less than two-thirds of the memory required by the state-of-the-art streaming clustering algorithm on average. Moreover, our strongest mode enhances solution quality by up to 150% on average. With this, CluStRE achieves comparable solution quality to in-memory algorithms, i.e. over 96% of the quality of clustering approaches, including Louvain, effectively bridging the gap between streaming and traditional clustering methods.

Cite as

Adil Chhabra, Shai Dorian Peretz, and Christian Schulz. CluStRE: Streaming Graph Clustering with Multi-Stage Refinement. In 23rd International Symposium on Experimental Algorithms (SEA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 338, pp. 11:1-11:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{chhabra_et_al:LIPIcs.SEA.2025.11,
  author =	{Chhabra, Adil and Dorian Peretz, Shai and Schulz, Christian},
  title =	{{CluStRE: Streaming Graph Clustering with Multi-Stage Refinement}},
  booktitle =	{23rd International Symposium on Experimental Algorithms (SEA 2025)},
  pages =	{11:1--11:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-375-1},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{338},
  editor =	{Mutzel, Petra and Prezza, Nicola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2025.11},
  URN =		{urn:nbn:de:0030-drops-232493},
  doi =		{10.4230/LIPIcs.SEA.2025.11},
  annote =	{Keywords: graph clustering, community, streaming, online, memetic, evolutionary}
}
Document
Track A: Algorithms, Complexity and Games
3.415-Approximation for Coflow Scheduling via Iterated Rounding

Authors: Lars Rohwedder and Leander Schnaars

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)


Abstract
We provide an algorithm giving a 140/41 (< 3.415)-approximation for Coflow Scheduling and a 4.36-approximation for Coflow Scheduling with release dates. This improves upon the best known 4- and respectively 5-approximations and addresses an open question posed by Agarwal, Rajakrishnan, Narayan, Agarwal, Shmoys, and Vahdat [Agarwal et al., 2018], Fukunaga [Fukunaga, 2022], and others. We additionally show that in an asymptotic setting, the algorithm achieves a (2+ε)-approximation, which is essentially optimal under ℙ ≠ NP. The improvements are achieved using a novel edge allocation scheme using iterated LP rounding together with a framework which enables establishing strong bounds for combinations of several edge allocation algorithms.

Cite as

Lars Rohwedder and Leander Schnaars. 3.415-Approximation for Coflow Scheduling via Iterated Rounding. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 128:1-128:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{rohwedder_et_al:LIPIcs.ICALP.2025.128,
  author =	{Rohwedder, Lars and Schnaars, Leander},
  title =	{{3.415-Approximation for Coflow Scheduling via Iterated Rounding}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{128:1--128:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.128},
  URN =		{urn:nbn:de:0030-drops-235050},
  doi =		{10.4230/LIPIcs.ICALP.2025.128},
  annote =	{Keywords: Coflow Scheduling, Approximation Algorithms, Iterated Rounding}
}
Document
Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories

Authors: Tianyu Chen, Zeyu Wang, Lin Li, Ding Li, Zongyang Li, Xiaoning Chang, Pan Bian, Guangtai Liang, Qianxiang Wang, and Tao Xie

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)


Abstract
Functionality-specific vulnerabilities, which mainly occur in Application Programming Interfaces (APIs) with specific functionalities, are crucial for software developers to detect and avoid. When detecting individual functionality-specific vulnerabilities, the existing two categories of approaches are ineffective because they consider only the API bodies and are unable to handle diverse implementations of functionality-equivalent APIs. To effectively detect functionality-specific vulnerabilities, we propose APISS, the first approach to utilize API doc strings and signatures instead of API bodies. APISS first retrieves functionality-equivalent APIs for APIs with existing vulnerabilities and then migrates Proof-of-Concepts (PoCs) of the existing vulnerabilities for newly detected vulnerable APIs. To retrieve functionality-equivalent APIs, we leverage a Large Language Model for API embedding to improve the accuracy and address the effectiveness and scalability issues suffered by the existing approaches. To migrate PoCs of the existing vulnerabilities for newly detected vulnerable APIs, we design a semi-automatic schema to substantially reduce manual costs. We conduct a comprehensive evaluation to empirically compare APISS with four state-of-the-art approaches of detecting vulnerabilities and two state-of-the-art approaches of retrieving functionality-equivalent APIs. The evaluation subjects include 180 widely used Java repositories using 10 existing vulnerabilities, along with their PoCs. The results show that APISS effectively retrieves functionality-equivalent APIs, achieving a Top-1 Accuracy of 0.81 while the best of the baselines under comparison achieves only 0.55. APISS is highly efficient: the manual costs are within 10 minutes per vulnerability and the end-to-end runtime overhead of testing one candidate API is less than 2 hours. APISS detects 179 new vulnerabilities and receives 60 new CVE IDs, bringing high value to security practice.

Cite as

Tianyu Chen, Zeyu Wang, Lin Li, Ding Li, Zongyang Li, Xiaoning Chang, Pan Bian, Guangtai Liang, Qianxiang Wang, and Tao Xie. Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 6:1-6:27, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.ECOOP.2025.6,
  author =	{Chen, Tianyu and Wang, Zeyu and Li, Lin and Li, Ding and Li, Zongyang and Chang, Xiaoning and Bian, Pan and Liang, Guangtai and Wang, Qianxiang and Xie, Tao},
  title =	{{Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{6:1--6:27},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.6},
  URN =		{urn:nbn:de:0030-drops-232999},
  doi =		{10.4230/LIPIcs.ECOOP.2025.6},
  annote =	{Keywords: Application Security, Vulnerability Detection, Large Language Model}
}
Document
Survey
Uncertainty Management in the Construction of Knowledge Graphs: A Survey

Authors: Lucas Jarnac, Yoan Chabot, and Miguel Couceiro

Published in: TGDK, Volume 3, Issue 1 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 1


Abstract
Knowledge Graphs (KGs) are a major asset for companies thanks to their great flexibility in data representation and their numerous applications, e.g., vocabulary sharing, Q&A or recommendation systems. To build a KG, it is a common practice to rely on automatic methods for extracting knowledge from various heterogeneous sources. However, in a noisy and uncertain world, knowledge may not be reliable and conflicts between data sources may occur. Integrating unreliable data would directly impact the use of the KG, therefore such conflicts must be resolved. This could be done manually by selecting the best data to integrate. This first approach is highly accurate, but costly and time-consuming. That is why recent efforts focus on automatic approaches, which represent a challenging task since it requires handling the uncertainty of extracted knowledge throughout its integration into the KG. We survey state-of-the-art approaches in this direction and present constructions of both open and enterprise KGs. We then describe different knowledge extraction methods and discuss downstream tasks after knowledge acquisition, including KG completion using embedding models, knowledge alignment, and knowledge fusion in order to address the problem of knowledge uncertainty in KG construction. We conclude with a discussion on the remaining challenges and perspectives when constructing a KG taking into account uncertainty.

Cite as

Lucas Jarnac, Yoan Chabot, and Miguel Couceiro. Uncertainty Management in the Construction of Knowledge Graphs: A Survey. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 1, pp. 3:1-3:48, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@Article{jarnac_et_al:TGDK.3.1.3,
  author =	{Jarnac, Lucas and Chabot, Yoan and Couceiro, Miguel},
  title =	{{Uncertainty Management in the Construction of Knowledge Graphs: A Survey}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:48},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{1},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.1.3},
  URN =		{urn:nbn:de:0030-drops-233733},
  doi =		{10.4230/TGDK.3.1.3},
  annote =	{Keywords: Knowledge reconciliation, Uncertainty, Heterogeneous sources, Knowledge graph construction}
}
Document
Formal Verification of a Fail-Safe Cross-Chain Bridge

Authors: Filip Marić, Bernhard Scholz, and Pavle Subotić

Published in: OASIcs, Volume 129, 6th International Workshop on Formal Methods for Blockchains (FMBC 2025)


Abstract
Cross-chain bridges are financial services that interconnect blockchains. High monetary values flow through these bridges, and their security must be safeguarded. However, designing real-world cross-chain bridges is a difficult endeavor. Due to blockchain’s closed-world nature, tokens cannot be transferred from a sender to a receiver chain; on the contrary, they need complex logic that maintains an equilibrium on both chains, even if either the chains or the bridge fail. This paper formally verifies a model of a novel fail-safe cross-chain bridge to ensure correctness. We define formal requirements and prove the bridge is safe using the Isabelle/HOL proof assistant.

Cite as

Filip Marić, Bernhard Scholz, and Pavle Subotić. Formal Verification of a Fail-Safe Cross-Chain Bridge. In 6th International Workshop on Formal Methods for Blockchains (FMBC 2025). Open Access Series in Informatics (OASIcs), Volume 129, pp. 8:1-8:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{maric_et_al:OASIcs.FMBC.2025.8,
  author =	{Mari\'{c}, Filip and Scholz, Bernhard and Suboti\'{c}, Pavle},
  title =	{{Formal Verification of a Fail-Safe Cross-Chain Bridge}},
  booktitle =	{6th International Workshop on Formal Methods for Blockchains (FMBC 2025)},
  pages =	{8:1--8:18},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-371-3},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{129},
  editor =	{Marmsoler, Diego and Xu, Meng},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.FMBC.2025.8},
  URN =		{urn:nbn:de:0030-drops-230342},
  doi =		{10.4230/OASIcs.FMBC.2025.8},
  annote =	{Keywords: Cross-Chain Bridge, Formal Verification, Logic, Security}
}
  • Refine by Type
  • 23 Document/PDF
  • 19 Document/HTML

  • Refine by Publication Year
  • 17 2025
  • 1 2024
  • 3 2023
  • 1 2022
  • 1 2016

  • Refine by Author
  • 3 Yang, Sheng
  • 2 Khuller, Samir
  • 1 Abrar, Md. Hasin
  • 1 Alghamdi, Kholoud Saad
  • 1 Antuori, Valentin
  • Show More...

  • Refine by Series/Journal
  • 13 LIPIcs
  • 4 OASIcs
  • 6 TGDK

  • Refine by Classification
  • 3 Information systems → Data mining
  • 2 Computing methodologies → Knowledge representation and reasoning
  • 2 Information systems → Graph-based database models
  • 2 Information systems → Information integration
  • 2 Mathematics of computing → Random graphs
  • Show More...

  • Refine by Keyword
  • 2 Approximation Algorithms
  • 2 Knowledge Graphs
  • 2 Knowledge graphs
  • 2 Large Language Models
  • 1 Aerospace Applications
  • Show More...

Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail