DROPS

Document

DOI: 10.4230/LIPIcs.ISAAC.2025.18

An Optimal Algorithm for the Stacker Crane Problem on Fixed Topologies

Authors: Yike Chen, Ke Shi, and Chao Xu

Published in: LIPIcs, Volume 359, 36th International Symposium on Algorithms and Computation (ISAAC 2025)

Abstract

The Stacker Crane Problem (SCP) is a variant of the Traveling Salesman Problem. In SCP, pairs of pickup and delivery points are designated on a graph, and a crane must visit these points to move objects from each pickup location to its respective delivery point. The goal is to minimize the total distance traveled. SCP is known to be NP-hard, even on trees. The only positive results, in terms of polynomial-time solvability, apply to graphs that are topologically equivalent to a path or a cycle. We propose an algorithm that is optimal for each fixed topology, running in near-linear time. This is achieved by demonstrating that the problem is fixed-parameter tractable (FPT) when parameterized by both the cycle rank and the number of branch vertices.

Cite as

Yike Chen, Ke Shi, and Chao Xu. An Optimal Algorithm for the Stacker Crane Problem on Fixed Topologies. In 36th International Symposium on Algorithms and Computation (ISAAC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 359, pp. 18:1-18:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{chen_et_al:LIPIcs.ISAAC.2025.18,
  author =	{Chen, Yike and Shi, Ke and Xu, Chao},
  title =	{{An Optimal Algorithm for the Stacker Crane Problem on Fixed Topologies}},
  booktitle =	{36th International Symposium on Algorithms and Computation (ISAAC 2025)},
  pages =	{18:1--18:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-408-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{359},
  editor =	{Chen, Ho-Lin and Hon, Wing-Kai and Tsai, Meng-Tsung},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2025.18},
  URN =		{urn:nbn:de:0030-drops-249269},
  doi =		{10.4230/LIPIcs.ISAAC.2025.18},
  annote =	{Keywords: Stacker Crane Problem, Fixed-Parameter Tractable, Min-Cost Circulation}
}

Document

Research

DOI: 10.4230/TGDK.3.2.3

GraphRAG on Technical Documents - Impact of Knowledge Graph Schema

Authors: Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke

Published in: TGDK, Volume 3, Issue 2 (2025). Transactions on Graph Data and Knowledge, Volume 3, Issue 2

Abstract

Retrieval Augmented Generation (RAG) is seeing rapid adoption in industry to enable employees to query information captured in proprietary data for their organisation. In this work, we test the impact of domain-relevant knowledge graph schemas on the results of Microsoft’s GraphRAG pipeline. Our approach aims to address the poor quality of GraphRAG responses on technical reports rich in domain-specific terms. The use case involves technical reports about geology, chemistry and mineral processing published by the Minerals Research Institute of Western Australia (MRIWA). Four schemas are considered: a simple five-class minerals domain expert-developed schema, an expanded minerals domain schema, the Microsoft GraphRAG auto-generated schema, and a schema-less GraphRAG. These are compared to a conventional baseline RAG. Performance is evaluated using a scoring approach that accounts for the mix of correct, incorrect, additional, and missing content in RAG responses. The results show that the simple five-class minerals domain schema extracts approximately 10% more entities from the MRIWA reports than the other schema options. Additionally, both the five-class and the expanded eight-class minerals domain schemas produce the most factually correct answers and the fewest hallucinations. We attribute this to the minerals-specific schemas extracting more relevant, domain-specific information during the Indexing stage. As a result, the Query stage’s context window includes more high-value content. This contributes to the observed improvement in answer quality compared to the other pipelines. In contrast, pipelines with fewer domain-related entities in the KG retrieve less valuable information, leaving more room for irrelevant content in the context window. Baseline RAG responses were typically shorter, less complete, and contained more hallucinations compared to our GraphRAG pipelines. We provide a complete set of resources at https://github.com/nlp-tlp/GraphRAG-on-Minerals-Domain/tree/main. These resources include links to the MRIWA reports, a set of questions (from simple to challenging) along with domain-expert curated answers, schemas, and evaluations of the pipelines.

Cite as

Henri Scaffidi, Melinda Hodkiewicz, Caitlin Woods, and Nicole Roocke. GraphRAG on Technical Documents - Impact of Knowledge Graph Schema. In Transactions on Graph Data and Knowledge (TGDK), Volume 3, Issue 2, pp. 3:1-3:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@Article{scaffidi_et_al:TGDK.3.2.3,
  author =	{Scaffidi, Henri and Hodkiewicz, Melinda and Woods, Caitlin and Roocke, Nicole},
  title =	{{GraphRAG on Technical Documents - Impact of Knowledge Graph Schema}},
  journal =	{Transactions on Graph Data and Knowledge},
  pages =	{3:1--3:24},
  ISSN =	{2942-7517},
  year =	{2025},
  volume =	{3},
  number =	{2},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/TGDK.3.2.3},
  URN =		{urn:nbn:de:0030-drops-248131},
  doi =		{10.4230/TGDK.3.2.3},
  annote =	{Keywords: RAG, minerals, local search, global search, entity extraction, competency questions}
}

Document

DOI: 10.4230/LIPIcs.ESA.2025.18

Efficiency of Learned Indexes on Genome Spectra

Authors: Md. Hasin Abrar, Paul Medvedev, and Giorgio Vinciguerra

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)

Abstract

Data structures on a multiset of genomic k-mers are at the heart of many bioinformatic tools. As genomic datasets grow in scale, the efficiency of these data structures increasingly depends on how well they leverage the inherent patterns in the data. One recent and effective approach is the use of learned indexes that approximate the rank function of a multiset using a piecewise linear function with very few segments. However, theoretical worst-case analysis struggles to predict the practical performance of these indexes. We address this limitation by developing a novel measure of piecewise-linear approximability of the data, called CaPLa (Canonical Piecewise Linear approximability). CaPLa builds on the empirical observation that a power-law model often serves as a reasonable proxy for piecewise linear-approximability, while explicitly accounting for deviations from a true power-law fit. We prove basic properties of CaPLa and present an efficient algorithm to compute it. We then demonstrate that CaPLa can accurately predict space bounds for data structures on real data. Empirically, we analyze over 500 genomes through the lens of CaPLa, revealing that it varies widely across the tree of life and even within individual genomes. Finally, we study the robustness of CaPLa as a measure and the factors that make genomic k-mer multisets different from random ones.

Cite as

Md. Hasin Abrar, Paul Medvedev, and Giorgio Vinciguerra. Efficiency of Learned Indexes on Genome Spectra. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 18:1-18:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{abrar_et_al:LIPIcs.ESA.2025.18,
  author =	{Abrar, Md. Hasin and Medvedev, Paul and Vinciguerra, Giorgio},
  title =	{{Efficiency of Learned Indexes on Genome Spectra}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{18:1--18:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.18},
  URN =		{urn:nbn:de:0030-drops-244865},
  doi =		{10.4230/LIPIcs.ESA.2025.18},
  annote =	{Keywords: Genome spectra, piecewise linear approximation, learned index, k-mers}
}

Document

DOI: 10.4230/LIPIcs.CP.2025.29

From Prediction to Action: A Constraint-Based Approach to Predictive Policing

Authors: Younes Mechqrane and Ismail Elabbassi

Published in: LIPIcs, Volume 340, 31st International Conference on Principles and Practice of Constraint Programming (CP 2025)

Abstract

Crime prevention in urban environments demands both accurate crime forecasting and the efficient deployment of limited law enforcement resources. In this paper, we present an integrated framework that combines a machine learning module (i.e. PredRNN++ [Wang et al., 2018]) for spatiotemporal crime prediction with a constraint programming module for patrol route optimization. Our approach operates within the ICON loop framework [Bessiere et al., 2017], facilitating iterative refinement of predictions and immediate adaptation of patrol strategies. We validate our method using the City of Chicago Crime Dataset. Experimental results show that routes informed by crime predictions significantly outperform strategies relying solely on historical patterns or operational constraints. These findings illustrate how coupling predictive analytics with constraint programming can substantially enhance resource allocation and overall crime deterrence.

Cite as

Younes Mechqrane and Ismail Elabbassi. From Prediction to Action: A Constraint-Based Approach to Predictive Policing. In 31st International Conference on Principles and Practice of Constraint Programming (CP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 340, pp. 29:1-29:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{mechqrane_et_al:LIPIcs.CP.2025.29,
  author =	{Mechqrane, Younes and Elabbassi, Ismail},
  title =	{{From Prediction to Action: A Constraint-Based Approach to Predictive Policing}},
  booktitle =	{31st International Conference on Principles and Practice of Constraint Programming (CP 2025)},
  pages =	{29:1--29:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-380-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{340},
  editor =	{de la Banda, Maria Garcia},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CP.2025.29},
  URN =		{urn:nbn:de:0030-drops-238902},
  doi =		{10.4230/LIPIcs.CP.2025.29},
  annote =	{Keywords: Inductive Constraint Programming (ICON) Loop, Next Frame Prediction, PredRNN++}
}

Document

DOI: 10.4230/LIPIcs.SAT.2025.32

Efficient Certified Reasoning for Binarized Neural Networks

Authors: Jiong Yang, Yong Kiam Tan, Mate Soos, Magnus O. Myreen, and Kuldeep S. Meel

Published in: LIPIcs, Volume 341, 28th International Conference on Theory and Applications of Satisfiability Testing (SAT 2025)

Abstract

Neural networks have emerged as essential components in safety-critical applications - these use cases demand complex, yet trustworthy computations. Binarized Neural Networks (BNNs) are a type of neural network where each neuron is constrained to a Boolean value; they are particularly well-suited for safety-critical tasks because they retain much of the computational capacities of full-scale (floating-point or quantized) deep neural networks, but remain compatible with satisfiability solvers for qualitative verification and with model counters for quantitative reasoning. However, existing methods for BNN analysis suffer from either limited scalability or susceptibility to soundness errors, which hinders their applicability in real-world scenarios. In this work, we present a scalable and trustworthy approach for both qualitative and quantitative verification of BNNs. Our approach introduces a native representation of BNN constraints in a custom-designed solver for qualitative reasoning, and in an approximate model counter for quantitative reasoning. We further develop specialized proof generation and checking pipelines with native support for BNN constraint reasoning, ensuring trustworthiness for all of our verification results. Empirical evaluations on a BNN robustness verification benchmark suite demonstrate that our certified solving approach achieves a 9× speedup over prior certified CNF and PB-based approaches, and our certified counting approach achieves a 218× speedup over the existing CNF-based baseline. In terms of coverage, our pipeline produces fully certified results for 99% and 86% of the qualitative and quantitative reasoning queries on BNNs, respectively. This is in sharp contrast to the best existing baselines which can fully certify only 62% and 4% of the queries, respectively.

Cite as

Jiong Yang, Yong Kiam Tan, Mate Soos, Magnus O. Myreen, and Kuldeep S. Meel. Efficient Certified Reasoning for Binarized Neural Networks. In 28th International Conference on Theory and Applications of Satisfiability Testing (SAT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 341, pp. 32:1-32:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{yang_et_al:LIPIcs.SAT.2025.32,
  author =	{Yang, Jiong and Tan, Yong Kiam and Soos, Mate and Myreen, Magnus O. and Meel, Kuldeep S.},
  title =	{{Efficient Certified Reasoning for Binarized Neural Networks}},
  booktitle =	{28th International Conference on Theory and Applications of Satisfiability Testing (SAT 2025)},
  pages =	{32:1--32:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-381-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{341},
  editor =	{Berg, Jeremias and Nordstr\"{o}m, Jakob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SAT.2025.32},
  URN =		{urn:nbn:de:0030-drops-237665},
  doi =		{10.4230/LIPIcs.SAT.2025.32},
  annote =	{Keywords: Neural network verification, proof certification, SAT solving, approximate model counting}
}

Document

DOI: 10.4230/LIPIcs.ECOOP.2025.13

FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation

Authors: Amber Gorzynski and Alastair F. Donaldson

Published in: LIPIcs, Volume 333, 39th European Conference on Object-Oriented Programming (ECOOP 2025)

Abstract

Decompilation is the process of translating compiled code into high-level code. Control flow recovery is a challenging part of the process. "Misdecompilations" can occur, whereby the decompiled code does not accurately represent the semantics of the compiled code, despite it being syntactically valid. This is problematic because it can mislead users who are trying to reason about the program. We present CFG-based program generation: a novel approach to randomised testing that aims to improve the control flow recovery of decompilers. CFG-based program generation involves randomly generating control flow graphs (CFGs) and paths through each graph. Inspired by prior work in the domain of GPU computing, (CFG, path) pairs are "fleshed" into test programs. Each program is decompiled and recompiled. The test oracle verifies whether the actual runtime path through the graph matches the expected path. Any difference in the execution paths after recompilation indicates a possible misdecompilation. A key benefit of this approach is that it is largely independent of the source and target languages in question because it is focused on control flow. The approach is therefore applicable to numerous decompilation settings. The trade-off resulting from the focus on control flow is that misdecompilation bugs that do not relate to control flow (e.g. bugs that involve specific arithmetic operations) are out of scope. We have implemented this approach in FuzzFlesh, an open-source randomised testing tool. FuzzFlesh can be easily configured to target a variety of low-level languages and decompiler toolchains because most of the CFG and path generation process is language-independent. At present, FuzzFlesh supports testing decompilation of Java bytecode, .NET assembly and x86 machine code. In addition to program generation, FuzzFlesh also includes an automated test-case reducer that operates on the CFG rather than the low-level program, which means that it can be applied to any of the target languages. We present a large experimental campaign applying FuzzFlesh to a variety of decompilers, leading to the discovery of 12 previously-unknown bugs across two language formats, six of which have been fixed. We present experiments comparing our generic FuzzFlesh tool to two state-of-the-art decompiler testing tools targeted at specific languages. As expected, the coverage our generic FuzzFlesh tool achieves on a given decompiler is lower than the coverage achieved by a tool specifically designed for the input format of that decompiler. However, due to its focus on control flow, FuzzFlesh is able to cover sections of control flow recovery code that the targeted tools cannot reach, and identify control flow related bugs that the targeted tools miss.

Cite as

Amber Gorzynski and Alastair F. Donaldson. FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation. In 39th European Conference on Object-Oriented Programming (ECOOP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 333, pp. 13:1-13:26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{gorzynski_et_al:LIPIcs.ECOOP.2025.13,
  author =	{Gorzynski, Amber and Donaldson, Alastair F.},
  title =	{{FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation}},
  booktitle =	{39th European Conference on Object-Oriented Programming (ECOOP 2025)},
  pages =	{13:1--13:26},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-373-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{333},
  editor =	{Aldrich, Jonathan and Silva, Alexandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2025.13},
  URN =		{urn:nbn:de:0030-drops-233062},
  doi =		{10.4230/LIPIcs.ECOOP.2025.13},
  annote =	{Keywords: Decompiler, Reverse Engineering, Control Flow, Software Testing, Fuzzing}
}

Document

Invited Talk

DOI: 10.4230/LIPIcs.ICDT.2025.1

The Quest for Faster Join Algorithms (Invited Talk)

Authors: Paraschos Koutris, Shaleen Deep, Austen Fan, and Hangdong Zhao

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

Joins are the cornerstone of relational databases. Surprisingly, even after several decades of research in the systems and theory database community, we still lack an understanding of how to design the fastest possible join algorithm. In this talk, we will present the exciting progress the database theory community has achieved in join algorithms over the last two decades. The talk will revolve around five key ideas fundamentally shaping this research area: tree decompositions, data partitioning, leveraging statistical information, enumeration, and algebraic techniques.

Cite as

Paraschos Koutris, Shaleen Deep, Austen Fan, and Hangdong Zhao. The Quest for Faster Join Algorithms (Invited Talk). In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 1:1-1:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{koutris_et_al:LIPIcs.ICDT.2025.1,
  author =	{Koutris, Paraschos and Deep, Shaleen and Fan, Austen and Zhao, Hangdong},
  title =	{{The Quest for Faster Join Algorithms}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{1:1--1:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.1},
  URN =		{urn:nbn:de:0030-drops-229428},
  doi =		{10.4230/LIPIcs.ICDT.2025.1},
  annote =	{Keywords: Conjunctive Queries, Joins, Tree Decompositions, Enumeration, Semirings}
}

Document

DOI: 10.4230/LIPIcs.ICDT.2025.27

Parallel Query Processing with Heterogeneous Machines

Authors: Simon Frisk and Paraschos Koutris

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)

Abstract

We study the problem of computing a full Conjunctive Query in parallel using p heterogeneous machines. Our computational model is similar to the MPC model, but each machine has its own cost function mapping from the number of bits it receives to a cost. An optimal algorithm should minimize the maximum cost across all machines. We consider algorithms over a single communication round and give a lower bound and matching upper bound for databases where each relation has the same cardinality. We do this for both linear cost functions like in previous work, but also for more general cost functions. For databases with relations of different cardinalities, we also find a lower bound, and give matching upper bounds for specific queries like the cartesian product, the join, the star query, and the triangle query. Our approach is inspired by the HyperCube algorithm, but there are additional challenges involved when machines have heterogeneous cost functions.

Cite as

Simon Frisk and Paraschos Koutris. Parallel Query Processing with Heterogeneous Machines. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 27:1-27:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{frisk_et_al:LIPIcs.ICDT.2025.27,
  author =	{Frisk, Simon and Koutris, Paraschos},
  title =	{{Parallel Query Processing with Heterogeneous Machines}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{27:1--27:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.27},
  URN =		{urn:nbn:de:0030-drops-229683},
  doi =		{10.4230/LIPIcs.ICDT.2025.27},
  annote =	{Keywords: Joins, Massively Parallel Computation, Heterogeneous}
}

Document

DOI: 10.4230/LIPIcs.STACS.2025.49

Cycle Counting Under Local Differential Privacy for Degeneracy-Bounded Graphs

Authors: Quentin Hillebrand, Vorapong Suppakitpaisarn, and Tetsuo Shibuya

Published in: LIPIcs, Volume 327, 42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025)

Abstract

We propose an algorithm for counting the number of cycles under local differential privacy for degeneracy-bounded input graphs. Numerous studies have focused on counting the number of triangles under the privacy notion, demonstrating that the expected 𝓁₂-error of these algorithms is Ω(n^{1.5}), where n is the number of nodes in the graph. When parameterized by the number of cycles of length four (C₄), the best existing triangle counting algorithm has an error of O(n^{1.5} + √C₄) = O(n²). In this paper, we introduce an algorithm with an expected 𝓁₂-error of O(δ^1.5 n^0.5 + δ^0.5 d_max^0.5 n^0.5), where δ is the degeneracy and d_{max} is the maximum degree of the graph. For degeneracy-bounded graphs (δ ∈ Θ(1)) commonly found in practical social networks, our algorithm achieves an expected 𝓁₂-error of O(d_{max}^{0.5} n^{0.5}) = O(n). Our algorithm’s core idea is a precise count of triangles following a preprocessing step that approximately sorts the degree of all nodes. This approach can be extended to approximate the number of cycles of length k, maintaining a similar 𝓁₂-error, namely O(δ^{(k-2)/2} d_max^0.5 n^{(k-2)/2} + δ^{k/2} n^{(k-2)/2}) or O(d_max^0.5 n^{(k-2)/2}) = O(n^{(k-1)/2}) for degeneracy-bounded graphs.

Cite as

Quentin Hillebrand, Vorapong Suppakitpaisarn, and Tetsuo Shibuya. Cycle Counting Under Local Differential Privacy for Degeneracy-Bounded Graphs. In 42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 327, pp. 49:1-49:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{hillebrand_et_al:LIPIcs.STACS.2025.49,
  author =	{Hillebrand, Quentin and Suppakitpaisarn, Vorapong and Shibuya, Tetsuo},
  title =	{{Cycle Counting Under Local Differential Privacy for Degeneracy-Bounded Graphs}},
  booktitle =	{42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025)},
  pages =	{49:1--49:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-365-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{327},
  editor =	{Beyersdorff, Olaf and Pilipczuk, Micha{\l} and Pimentel, Elaine and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2025.49},
  URN =		{urn:nbn:de:0030-drops-228748},
  doi =		{10.4230/LIPIcs.STACS.2025.49},
  annote =	{Keywords: Differential privacy, triangle counting, degeneracy, arboricity, graph theory, parameterized accuracy}
}

@InProceedings{hillebrand_et_al:LIPIcs.STACS.2025.49,
  author =	{Hillebrand, Quentin and Suppakitpaisarn, Vorapong and Shibuya, Tetsuo},
  title =	{{Cycle Counting Under Local Differential Privacy for Degeneracy-Bounded Graphs}},
  booktitle =	{42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025)},
  pages =	{49:1--49:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-365-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{327},
  editor =	{Beyersdorff, Olaf and Pilipczuk, Micha{\l} and Pimentel, Elaine and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2025.49},
  URN =		{urn:nbn:de:0030-drops-228748},
  doi =		{10.4230/LIPIcs.STACS.2025.49},
  annote =	{Keywords: Differential privacy, triangle counting, degeneracy, arboricity, graph theory, parameterized accuracy}
}

Document

DOI: 10.4230/LIPIcs.ITCS.2025.39

Data Reconstruction: When You See It and When You Don't

Authors: Edith Cohen, Haim Kaplan, Yishay Mansour, Shay Moran, Kobbi Nissim, Uri Stemmer, and Eliad Tsfadia

Published in: LIPIcs, Volume 325, 16th Innovations in Theoretical Computer Science Conference (ITCS 2025)

Abstract

We revisit the fundamental question of formally defining what constitutes a reconstruction attack. While often clear from the context, our exploration reveals that a precise definition is much more nuanced than it appears, to the extent that a single all-encompassing definition may not exist. Thus, we employ a different strategy and aim to "sandwich" the concept of reconstruction attacks by addressing two complementing questions: (i) What conditions guarantee that a given system is protected against such attacks? (ii) Under what circumstances does a given attack clearly indicate that a system is not protected? More specifically, - We introduce a new definitional paradigm - Narcissus Resiliency - to formulate a security definition for protection against reconstruction attacks. This paradigm has a self-referential nature that enables it to circumvent shortcomings of previously studied notions of security. Furthermore, as a side-effect, we demonstrate that Narcissus resiliency captures as special cases multiple well-studied concepts including differential privacy and other security notions of one-way functions and encryption schemes. - We formulate a link between reconstruction attacks and Kolmogorov complexity. This allows us to put forward a criterion for evaluating when such attacks are convincingly successful.

Cite as

Edith Cohen, Haim Kaplan, Yishay Mansour, Shay Moran, Kobbi Nissim, Uri Stemmer, and Eliad Tsfadia. Data Reconstruction: When You See It and When You Don't. In 16th Innovations in Theoretical Computer Science Conference (ITCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 325, pp. 39:1-39:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)

Copy BibTex To Clipboard

@InProceedings{cohen_et_al:LIPIcs.ITCS.2025.39,
  author =	{Cohen, Edith and Kaplan, Haim and Mansour, Yishay and Moran, Shay and Nissim, Kobbi and Stemmer, Uri and Tsfadia, Eliad},
  title =	{{Data Reconstruction: When You See It and When You Don't}},
  booktitle =	{16th Innovations in Theoretical Computer Science Conference (ITCS 2025)},
  pages =	{39:1--39:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-361-4},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{325},
  editor =	{Meka, Raghu},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2025.39},
  URN =		{urn:nbn:de:0030-drops-226674},
  doi =		{10.4230/LIPIcs.ITCS.2025.39},
  annote =	{Keywords: differential privacy, reconstruction}
}

Document

DOI: 10.4230/DagSemProc.08131.4

Coreference Resolution in Biomedical Texts: a Machine Learning Approach

Authors: Jian Su, Xiaofeng Yang, Huaqing Hong, Yuka Tateisi, and Jun'ichi Tsujii

Published in: Dagstuhl Seminar Proceedings, Volume 8131, Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives (2008)

Abstract

Motivation: Coreference resolution, the process of identifying different mentions of an entity, is a very important component in a text-mining system. Compared with the work in news articles, the existing study of coreference resolution in biomedical texts is quite preliminary by only focusing on specific types of anaphors like pronouns or definite noun phrases, using heuristic methods, and running on small data sets. Therefore, there is a need for an in-depth exploration of this task in the biomedical domain. Results: In this article, we presented a learning-based approach to coreference resolution in the biomedical domain. We made three contributions in our study. Firstly, we annotated a large scale coreference corpus, MedCo, which consists of 1,999 medline abstracts in the GENIA data set. Secondly, we proposed a detailed framework for the coreference resolution task, in which we augmented the traditional learning model by incorporating non-anaphors into training. Lastly, we explored various sources of knowledge for coreference resolution, particularly, those that can deal with the complexity of biomedical texts. The evaluation on the MedCo corpus showed promising results. Our coreference resolution system achieved a high precision of 85.2% with a reasonable recall of 65.3%, obtaining an F-measure of 73.9%. The results also suggested that our augmented learning model significantly boosted precision (up to 24.0%) without much loss in recall (less than 5%), and brought a gain of over 8% in F-measure.

Cite as

Jian Su, Xiaofeng Yang, Huaqing Hong, Yuka Tateisi, and Jun'ichi Tsujii. Coreference Resolution in Biomedical Texts: a Machine Learning Approach. In Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives. Dagstuhl Seminar Proceedings, Volume 8131, p. 1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)

Copy BibTex To Clipboard

@InProceedings{su_et_al:DagSemProc.08131.4,
  author =	{Su, Jian and Yang, Xiaofeng and Hong, Huaqing and Tateisi, Yuka and Tsujii, Jun'ichi},
  title =	{{Coreference Resolution in Biomedical Texts: a Machine Learning Approach}},
  booktitle =	{Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives},
  pages =	{1--1},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{8131},
  editor =	{Michael Ashburner and Ulf Leser and Dietrich Rebholz-Schuhmann},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08131.4},
  URN =		{urn:nbn:de:0030-drops-15220},
  doi =		{10.4230/DagSemProc.08131.4},
  annote =	{Keywords: Coreference resolution, biomedical text}
}

11 Search Results for "Yang, Xiaofeng"

An Optimal Algorithm for the Stacker Crane Problem on Fixed Topologies

Abstract

Cite as

GraphRAG on Technical Documents - Impact of Knowledge Graph Schema

Abstract

Cite as

Efficiency of Learned Indexes on Genome Spectra

Abstract

Cite as

From Prediction to Action: A Constraint-Based Approach to Predictive Policing

Abstract

Cite as

Efficient Certified Reasoning for Binarized Neural Networks

Abstract

Cite as

FuzzFlesh: Randomised Testing of Decompilers via Control Flow Graph-Based Program Generation

Abstract

Cite as

The Quest for Faster Join Algorithms (Invited Talk)

Abstract

Cite as

Parallel Query Processing with Heterogeneous Machines

Abstract

Cite as

Cycle Counting Under Local Differential Privacy for Degeneracy-Bounded Graphs

Abstract

Cite as

Data Reconstruction: When You See It and When You Don't

Abstract

Cite as

Coreference Resolution in Biomedical Texts: a Machine Learning Approach

Abstract

Cite as

Thanks for your feedback!

Could not send message