86 Search Results for "Wirth, Anthony"


Volume

LIPIcs, Volume 322

35th International Symposium on Algorithms and Computation (ISAAC 2024)

ISAAC 2024, December 8-11, 2024, Sydney, Australia

Editors: Julián Mestre and Anthony Wirth

Document
A Min-Entropy Approach to Multi-Party Communication Lower Bounds

Authors: Mi-Ying (Miryam) Huang, Xinyu Mao, Shuo Wang, Guangxu Yang, and Jiapeng Zhang

Published in: LIPIcs, Volume 339, 40th Computational Complexity Conference (CCC 2025)


Abstract
Information complexity is one of the most powerful techniques to prove information-theoretical lower bounds, in which Shannon entropy plays a central role. Though Shannon entropy has some convenient properties, such as the chain rule, it still has inherent limitations. One of the most notable barriers is the square-root loss, which appears in the square-root gap between entropy gaps and statistical distances, e.g., Pinsker’s inequality. To bypass this barrier, we introduce a new method based on min-entropy analysis. Building on this new method, we prove the following results. - An Ω(N^{∑_i α_i - max_i {α_i}}/k) randomized communication lower bound of the k-party set-intersection problem where the i-th party holds a random set of size ≈ N^{1-α_i}. - A tight Ω(n/k) randomized lower bound of the k-party Tree Pointer Jumping problems, improving an Ω(n/k²) lower bound by Chakrabarti, Cormode, and McGregor (STOC 08). - An Ω(n/k+√n) lower bound of the Chained Index problem, improving an Ω(n/k²) lower bound by Cormode, Dark, and Konrad (ICALP 19). Since these problems served as hard problems for numerous applications in streaming lower bounds and cryptography, our new lower bounds directly improve these streaming lower bounds and cryptography lower bounds. On the technical side, min-entropy does not have nice properties such as the chain rule. To address this issue, we enhance the structure-vs-pseudorandomness decomposition used by Göös, Pitassi, and Watson (FOCS 17) and Yang and Zhang (STOC 24); both papers used this decomposition to prove communication lower bounds. In this paper, we give a new breath to this method in the multi-party setting, presenting a new toolkit for proving multi-party communication lower bounds.

Cite as

Mi-Ying (Miryam) Huang, Xinyu Mao, Shuo Wang, Guangxu Yang, and Jiapeng Zhang. A Min-Entropy Approach to Multi-Party Communication Lower Bounds. In 40th Computational Complexity Conference (CCC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 339, pp. 33:1-33:29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{huang_et_al:LIPIcs.CCC.2025.33,
  author =	{Huang, Mi-Ying (Miryam) and Mao, Xinyu and Wang, Shuo and Yang, Guangxu and Zhang, Jiapeng},
  title =	{{A Min-Entropy Approach to Multi-Party Communication Lower Bounds}},
  booktitle =	{40th Computational Complexity Conference (CCC 2025)},
  pages =	{33:1--33:29},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-379-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{339},
  editor =	{Srinivasan, Srikanth},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2025.33},
  URN =		{urn:nbn:de:0030-drops-237273},
  doi =		{10.4230/LIPIcs.CCC.2025.33},
  annote =	{Keywords: communication complexity, lifting theorems, set intersection, chained index}
}
Document
Track A: Algorithms, Complexity and Games
Deterministic Independent Sets in the Semi-Streaming Model

Authors: Daniel Ye

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)


Abstract
We consider the independent set problem in the semi-streaming model. For any input graph G = (V, E) with n vertices, an independent set is a set of vertices with no edges between any two elements. In the semi-streaming model, G is presented as a stream of edges and any algorithm must use Õ(n) bits of memory to output a large independent set at the end of the stream. Prior work has designed various semi-streaming algorithms for finding independent sets. Due to the hardness of finding maximum and maximal independent sets in the semi-streaming model, the focus has primarily been on finding independent sets in terms of certain parameters, such as the maximum degree Δ. In particular, there is a simple randomized algorithm that obtains independent sets of size n/(Δ+1) in expectation, which can also be achieved with high probability using more complicated algorithms. For deterministic algorithms, the best bounds are significantly weaker. The best we know is a straightforward algorithm that finds an Ω̃(n/(Δ²)) size independent set. We show that this straightforward algorithm is nearly optimal by proving that any deterministic semi-streaming algorithm can only output an Õ(n/(Δ²)) size independent set. Our result proves a strong separation between the power of deterministic and randomized semi-streaming algorithms for the independent set problem.

Cite as

Daniel Ye. Deterministic Independent Sets in the Semi-Streaming Model. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 135:1-135:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{ye:LIPIcs.ICALP.2025.135,
  author =	{Ye, Daniel},
  title =	{{Deterministic Independent Sets in the Semi-Streaming Model}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{135:1--135:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.135},
  URN =		{urn:nbn:de:0030-drops-235129},
  doi =		{10.4230/LIPIcs.ICALP.2025.135},
  annote =	{Keywords: Sublinear Algorithms, Derandomization, Semi-Streaming Algorithms}
}
Document
Track A: Algorithms, Complexity and Games
Fitting Tree Metrics and Ultrametrics in Data Streams

Authors: Amir Carmel, Debarati Das, Evangelos Kipouridis, and Evangelos Pipis

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)


Abstract
Fitting distances to tree metrics and ultrametrics are two widely used methods in hierarchical clustering, primarily explored within the context of numerical taxonomy. Formally, given a positive distance function D: binom(V,2) → ℝ_{>0}, the goal is to find a tree (or an ultrametric) T including all elements of set V, such that the difference between the distances among vertices in T and those specified by D is minimized. Numerical taxonomy was first introduced by Sneath and Sokal [Nature 1962], and since then it has been studied extensively in both biology and computer science. In this paper, we initiate the study of ultrametric and tree metric fitting problems in the semi-streaming model, where the distances between pairs of elements from V (with |V| = n), defined by the function D, can arrive in an arbitrary order. We study these problems under various distance norms; namely the 𝓁₀ objective, which aims to minimize the number of modified entries in D to fit a tree-metric or an ultrametric; the 𝓁₁ objective, which seeks to minimize the total sum of distance errors across all pairs of points in V; and the 𝓁_∞ objective, which focuses on minimizing the maximum error incurred by any entries in D. - Our first result addresses the 𝓁₀ objective. We provide a single-pass polynomial-time Õ(n)-space O(1) approximation algorithm for ultrametrics and prove that no single-pass exact algorithm exists, even with exponential time. - Next, we show that the algorithm for 𝓁₀ implies an O(Δ/δ) approximation for the 𝓁₁ objective, where Δ is the maximum, and δ is the minimum absolute difference between distances in the input. This bound matches the best-known approximation for the RAM model using a combinatorial algorithm when Δ/δ = O(n). - For the 𝓁_∞ objective, we provide a complete characterization of the ultrametric fitting problem. First, we present a single-pass polynomial-time Õ(n)-space 2-approximation algorithm and show that no better than 2-approximation is possible, even with exponential time. Furthermore, we show that with an additional pass, it is possible to achieve a polynomial-time exact algorithm for ultrametrics. - Finally, we extend all these results to tree metrics by using only one additional pass through the stream and without asymptotically increasing the approximation factor.

Cite as

Amir Carmel, Debarati Das, Evangelos Kipouridis, and Evangelos Pipis. Fitting Tree Metrics and Ultrametrics in Data Streams. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 42:1-42:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{carmel_et_al:LIPIcs.ICALP.2025.42,
  author =	{Carmel, Amir and Das, Debarati and Kipouridis, Evangelos and Pipis, Evangelos},
  title =	{{Fitting Tree Metrics and Ultrametrics in Data Streams}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{42:1--42:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.42},
  URN =		{urn:nbn:de:0030-drops-234197},
  doi =		{10.4230/LIPIcs.ICALP.2025.42},
  annote =	{Keywords: Streaming, Clustering, Ultrametrics, Tree metrics, Distance fitting}
}
Document
Track A: Algorithms, Complexity and Games
Simultaneously Approximating All Norms for Massively Parallel Correlation Clustering

Authors: Nairen Cao, Shi Li, and Jia Ye

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)


Abstract
We revisit the simultaneous approximation model for the correlation clustering problem introduced by Davies, Moseley, and Newman [Davies et al., 2024]. The objective is to find a clustering that minimizes given norms of the disagreement vector over all vertices. We present an efficient algorithm that produces a clustering that is simultaneously a 63.3-approximation for all monotone symmetric norms. This significantly improves upon the previous approximation ratio of 6348 due to Davies, Moseley, and Newman [Davies et al., 2024], which works only for 𝓁_p-norms. To achieve this result, we first reduce the problem to approximating all top-k norms simultaneously, using the connection between monotone symmetric norms and top-k norms established by Chakrabarty and Swamy [Chakrabarty and Swamy, 2019]. Then we develop a novel procedure that constructs a 12.66-approximate fractional clustering for all top-k norms. Our 63.3-approximation ratio is obtained by combining this with the 5-approximate rounding algorithm by Kalhan, Makarychev, and Zhou [Kalhan et al., 2019]. We then demonstrate that with a loss of ε in the approximation ratio, the algorithm can be adapted to run in nearly linear time and in the MPC (massively parallel computation) model with poly-logarithmic number of rounds. By allowing a further trade-off in the approximation ratio to (359+ε), the number of MPC rounds can be reduced to a constant.

Cite as

Nairen Cao, Shi Li, and Jia Ye. Simultaneously Approximating All Norms for Massively Parallel Correlation Clustering. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 40:1-40:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{cao_et_al:LIPIcs.ICALP.2025.40,
  author =	{Cao, Nairen and Li, Shi and Ye, Jia},
  title =	{{Simultaneously Approximating All Norms for Massively Parallel Correlation Clustering}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{40:1--40:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.40},
  URN =		{urn:nbn:de:0030-drops-234171},
  doi =		{10.4230/LIPIcs.ICALP.2025.40},
  annote =	{Keywords: Correlation Clustering, All-Norms, Approximation Algorithm, Massively Parallel Algorithm}
}
Document
Net Occurrences in Fibonacci and Thue-Morse Words

Authors: Peaker Guo and Kaisei Kishi

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)


Abstract
A net occurrence of a repeated string in a text is an occurrence with unique left and right extensions, and the net frequency of the string is the number of its net occurrences in the text. Originally introduced for applications in Natural Language Processing, net frequency has recently gained attention for its algorithmic aspects. Guo et al. [CPM 2024] and Ohlebusch et al. [SPIRE 2024] focus on its computation in the offline setting, while Guo et al. [SPIRE 2024], Inenaga [arXiv 2024], and Mieno and Inenaga [CPM 2025] tackle the online counterpart. Mieno and Inenaga also characterize net occurrences in terms of the minimal unique substrings of the text. Additionally, Guo et al. [CPM 2024] initiate the study of net occurrences in Fibonacci words to establish a lower bound on the asymptotic running time of algorithms. Although there has been notable progress in algorithmic developments and some initial combinatorial insights, the combinatorial aspects of net occurrences have yet to be thoroughly examined. In this work, we make two key contributions. First, we confirm the conjecture that each Fibonacci word contains exactly three net occurrences. Second, we show that each Thue-Morse word contains exactly nine net occurrences. To achieve these results, we introduce the notion of overlapping net occurrence cover, which narrows down the candidate net occurrences in any text. Furthermore, we provide a precise characterization of occurrences of Fibonacci and Thue-Morse words of smaller order, offering structural insights that may have independent interest and potential applications in algorithm analysis and combinatorial properties of these words.

Cite as

Peaker Guo and Kaisei Kishi. Net Occurrences in Fibonacci and Thue-Morse Words. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 16:1-16:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{guo_et_al:LIPIcs.CPM.2025.16,
  author =	{Guo, Peaker and Kishi, Kaisei},
  title =	{{Net Occurrences in Fibonacci and Thue-Morse Words}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{16:1--16:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.16},
  URN =		{urn:nbn:de:0030-drops-231107},
  doi =		{10.4230/LIPIcs.CPM.2025.16},
  annote =	{Keywords: Fibonacci words, Thue-Morse words, net occurrence, net frequency, factorization}
}
Document
Space-Efficient Online Computation of String Net Occurrences

Authors: Takuya Mieno and Shunsuke Inenaga

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)


Abstract
A substring u of a string T is said to be a repeat if u occurs at least twice in T. An occurrence [i..j] of a repeat u in T is said to be a net occurrence if each of the substrings aub = T[i-1..j+1], au = T[i-1..j], and ub = T[i..j+1] occurs exactly once in T. The occurrence [i-1..j+1] of aub is said to be an extended net occurrence of u. Let T be an input string of length n over an alphabet of size σ, and let ENO(T) denote the set of extended net occurrences of repeats in T. Guo et al. [SPIRE 2024] presented an online algorithm which can report ENO(T[1..i]) in T[1..i] in O(nσ²) time, for each prefix T[1..i] of T. Very recently, Inenaga [arXiv 2024] gave a faster online algorithm that can report ENO(T[1..i]) in optimal O(#ENO(T[1..i])) time for each prefix T[1..i] of T, where #S denotes the cardinality of a set S. Both of the aforementioned data structures can be maintained in O(n log σ) time and occupy O(n) space, where the O(n)-space requirement comes from the suffix tree data structure. In particular, Inenaga’s recent algorithm is based on Weiner’s right-to-left online suffix tree construction. In this paper, we show that one can modify Ukkonen’s left-to-right online suffix tree construction algorithm in O(n) space, so that ENO(T[1..i]) can be reported in optimal O(#ENO(T[1..i])) time for each prefix T[1..i] of T. This is an improvement over Guo et al.’s method that is also based on Ukkonen’s algorithm. Further, this leads us to the two following space-efficient alternatives: - A sliding-window algorithm of O(d) working space that can report ENO(T[i-d+1..i]) in optimal O(#ENO(T[i-d+1..i])) time for each sliding window T[i-d+1..i] of size d in T. - A CDAWG-based online algorithm of O(𝖾) working space that can report ENO(T[1..i]) in optimal O(#ENO(T[1..i])) time for each prefix T[1..i] of T, where 𝖾 < 2n is the number of edges in the CDAWG for T. All of our proposed data structures can be maintained in O(n log σ) time for the input online string T. We also discuss that the extended net occurrences of repeats in T can be fully characterized in terms of the minimal unique substrings (MUSs) in T.

Cite as

Takuya Mieno and Shunsuke Inenaga. Space-Efficient Online Computation of String Net Occurrences. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 23:1-23:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{mieno_et_al:LIPIcs.CPM.2025.23,
  author =	{Mieno, Takuya and Inenaga, Shunsuke},
  title =	{{Space-Efficient Online Computation of String Net Occurrences}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{23:1--23:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.23},
  URN =		{urn:nbn:de:0030-drops-231175},
  doi =		{10.4230/LIPIcs.CPM.2025.23},
  annote =	{Keywords: string net occurrences, suffix trees, CDAWGs, maximal repeats, minimal unique substrings (MUSs)}
}
Document
O(1)-Round MPC Algorithms for Multi-Dimensional Grid Graph Connectivity, Euclidean MST and DBSCAN

Authors: Junhao Gan, Anthony Wirth, and Zhuo Zhang

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)


Abstract
In this paper, we investigate three fundamental problems in the Massively Parallel Computation (MPC) model: (i) grid graph connectivity, (ii) approximate Euclidean Minimum Spanning Tree (EMST), and (iii) approximate DBSCAN. Our first result is a O(1)-round Las Vegas (i.e., succeeding with high probability) MPC algorithm for computing the connected components on a d-dimensional c-penetration grid graph ((d,c)-grid graph), where both d and c are positive integer constants. In such a grid graph, each vertex is a point with integer coordinates in ℕ^d, and an edge can only exist between two distinct vertices with 𝓁_∞-norm at most c. To our knowledge, the current best existing result for computing the connected components (CC’s) on (d,c)-grid graphs in the MPC model is to run the state-of-the-art MPC CC algorithms that are designed for general graphs: they achieve O(log log n + log D) [Behnezhad et al., 2019] and O(log log n + log 1/(λ)) [Sepehr Assadi et al., 2019] rounds, respectively, where D is the diameter and λ is the spectral gap of the graph. With our grid graph connectivity technique, our second main result is a O(1)-round Las Vegas MPC algorithm for computing approximate Euclidean MST. The existing state-of-the-art result on this problem is the O(1)-round MPC algorithm proposed by Andoni et al. [Alexandr Andoni et al., 2014], which only guarantees an approximation on the overall weight in expectation. In contrast, our algorithm not only guarantees a deterministic overall weight approximation, but also achieves a deterministic edge-wise weight approximation. The latter property is crucial to many applications, such as finding the Bichromatic Closest Pair and Single-Linkage Clustering. Last, but not least, our third main result is a O(1)-round Las Vegas MPC algorithm for computing an approximate DBSCAN clustering in O(1)-dimensional Euclidean space.

Cite as

Junhao Gan, Anthony Wirth, and Zhuo Zhang. O(1)-Round MPC Algorithms for Multi-Dimensional Grid Graph Connectivity, Euclidean MST and DBSCAN. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 7:1-7:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{gan_et_al:LIPIcs.ICDT.2025.7,
  author =	{Gan, Junhao and Wirth, Anthony and Zhang, Zhuo},
  title =	{{O(1)-Round MPC Algorithms for Multi-Dimensional Grid Graph Connectivity, Euclidean MST and DBSCAN}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{7:1--7:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.7},
  URN =		{urn:nbn:de:0030-drops-229483},
  doi =		{10.4230/LIPIcs.ICDT.2025.7},
  annote =	{Keywords: Massively Parallel Computation, Graph Connectivity, Grid Graphs, Euclidean Minimum Spanning Tree, DBSCAN}
}
Document
Targeted Least Cardinality Candidate Key for Relational Databases

Authors: Vasileios Nakos, Hung Q. Ngo, and Charalampos E. Tsourakakis

Published in: LIPIcs, Volume 328, 28th International Conference on Database Theory (ICDT 2025)


Abstract
Functional dependencies (FDs) are a central theme in databases, playing a major role in the design of database schemas and the optimization of queries [Ramakrishnan and Gehrke, 2003]. In this work, we introduce the targeted least cardinality candidate key problem (TCAND). This problem is defined over a set of functional dependencies ℱ and a target variable set T ⊆ V, and it aims to find the smallest set X ⊆ V such that the FD X → T can be derived from ℱ. The TCAND problem generalizes the well-known NP-hard problem of finding the least cardinality candidate key [Lucchesi and Osborn, 1978], which has been previously demonstrated to be at least as difficult as the set cover problem. We present an integer programming (IP) formulation for the TCAND problem, analogous to a layered set cover problem. We analyze its linear programming (LP) relaxation from two perspectives: we propose two approximation algorithms and investigate the integrality gap. Our findings indicate that the approximation upper bounds for our algorithms are not significantly improvable through LP rounding, a notable distinction from the standard Set Cover problem. Additionally, we discover that a generalization of the TCAND problem is equivalent to a variant of the Set Cover problem, named Red Blue Set Cover [Carr et al., 2000], which cannot be approximated within a sub-polynomial factor in polynomial time under plausible conjectures [Chlamtáč et al., 2023]. Despite the extensive history surrounding the issue of identifying the least cardinality candidate key, our research contributes new theoretical insights, novel algorithms, and demonstrates that the general TCAND problem poses complexities beyond those encountered in the Set Cover problem.

Cite as

Vasileios Nakos, Hung Q. Ngo, and Charalampos E. Tsourakakis. Targeted Least Cardinality Candidate Key for Relational Databases. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 21:1-21:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{nakos_et_al:LIPIcs.ICDT.2025.21,
  author =	{Nakos, Vasileios and Ngo, Hung Q. and Tsourakakis, Charalampos E.},
  title =	{{Targeted Least Cardinality Candidate Key for Relational Databases}},
  booktitle =	{28th International Conference on Database Theory (ICDT 2025)},
  pages =	{21:1--21:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-364-5},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{328},
  editor =	{Roy, Sudeepa and Kara, Ahmet},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.21},
  URN =		{urn:nbn:de:0030-drops-229628},
  doi =		{10.4230/LIPIcs.ICDT.2025.21},
  annote =	{Keywords: functional dependencies, candidate key, approximation algorithms, hardness}
}
Document
Tight Approximation and Kernelization Bounds for Vertex-Disjoint Shortest Paths

Authors: Matthias Bentert, Fedor V. Fomin, and Petr A. Golovach

Published in: LIPIcs, Volume 327, 42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025)


Abstract
We examine the possibility of approximating Maximum Vertex-Disjoint Shortest Paths. In this problem, the input is an edge-weighted (directed or undirected) n-vertex graph G along with k terminal pairs (s_1,t_1),(s_2,t_2),…,(s_k,t_k). The task is to connect as many terminal pairs as possible by pairwise vertex-disjoint paths such that each path is a shortest path between the respective terminals. Our work is anchored in the recent breakthrough by Lochet [SODA '21], which demonstrates the polynomial-time solvability of the problem for a fixed value of k. Lochet’s result implies the existence of a polynomial-time ck-approximation for Maximum Vertex-Disjoint Shortest Paths, where c ≤ 1 is a constant. (One can guess 1/c terminal pairs to connect in k^O(1/c) time and then utilize Lochet’s algorithm to compute the solution in n^f(1/c) time.) Our first result suggests that this approximation algorithm is, in a sense, the best we can hope for. More precisely, assuming the gap-ETH, we exclude the existence of an o(k)-approximation within f(k) ⋅ poly(n) time for any function f that only depends on k. Our second result demonstrates the infeasibility of achieving an approximation ratio of m^{1/2-ε} in polynomial time, unless P = NP. It is not difficult to show that a greedy algorithm selecting a path with the minimum number of arcs results in a ⌈√𝓁⌉-approximation, where 𝓁 is the number of edges in all the paths of an optimal solution. Since 𝓁 ≤ n, this underscores the tightness of the m^{1/2-ε}-inapproximability bound. Additionally, we establish that Maximum Vertex-Disjoint Shortest Paths is fixed-parameter tractable when parameterized by 𝓁 but does not admit a polynomial kernel. Our hardness results hold for undirected graphs with unit weights, while our positive results extend to scenarios where the input graph is directed and features arbitrary (non-negative) edge weights.

Cite as

Matthias Bentert, Fedor V. Fomin, and Petr A. Golovach. Tight Approximation and Kernelization Bounds for Vertex-Disjoint Shortest Paths. In 42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 327, pp. 17:1-17:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{bentert_et_al:LIPIcs.STACS.2025.17,
  author =	{Bentert, Matthias and Fomin, Fedor V. and Golovach, Petr A.},
  title =	{{Tight Approximation and Kernelization Bounds for Vertex-Disjoint Shortest Paths}},
  booktitle =	{42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025)},
  pages =	{17:1--17:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-365-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{327},
  editor =	{Beyersdorff, Olaf and Pilipczuk, Micha{\l} and Pimentel, Elaine and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2025.17},
  URN =		{urn:nbn:de:0030-drops-228422},
  doi =		{10.4230/LIPIcs.STACS.2025.17},
  annote =	{Keywords: Inapproximability, Fixed-parameter tractability, Parameterized approximation}
}
Document
A Faster Algorithm for Constrained Correlation Clustering

Authors: Nick Fischer, Evangelos Kipouridis, Jonas Klausen, and Mikkel Thorup

Published in: LIPIcs, Volume 327, 42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025)


Abstract
In the Correlation Clustering problem we are given n nodes, and a preference for each pair of nodes indicating whether we prefer the two endpoints to be in the same cluster or not. The output is a clustering inducing the minimum number of violated preferences. In certain cases, however, the preference between some pairs may be too important to be violated. The constrained version of this problem specifies pairs of nodes that must be in the same cluster as well as pairs that must not be in the same cluster (hard constraints). The output clustering has to satisfy all hard constraints while minimizing the number of violated preferences. Constrained Correlation Clustering is APX-Hard and has been approximated within a factor 3 by van Zuylen et al. [SODA '07]. Their algorithm is based on rounding an LP with Θ(n³) constraints, resulting in an Ω(n^{3ω}) running time. In this work, using a more combinatorial approach, we show how to approximate this problem significantly faster at the cost of a slightly weaker approximation factor. In particular, our algorithm runs in Õ(n³) time (notice that the input size is Θ(n²)) and approximates Constrained Correlation Clustering within a factor 16. To achieve our result we need properties guaranteed by a particular influential algorithm for (unconstrained) Correlation Clustering, the CC-PIVOT algorithm. This algorithm chooses a pivot node u, creates a cluster containing u and all its preferred nodes, and recursively solves the rest of the problem. It is known that selecting pivots at random gives a 3-approximation. As a byproduct of our work, we provide a derandomization of the CC-PIVOT algorithm that still achieves the 3-approximation; furthermore, we show that there exist instances where no ordering of the pivots can give a (3-ε)-approximation, for any constant ε. Finally, we introduce a node-weighted version of Correlation Clustering, which can be approximated within factor 3 using our insights on Constrained Correlation Clustering. As the general weighted version of Correlation Clustering would require a major breakthrough to approximate within a factor o(log n), Node-Weighted Correlation Clustering may be a practical alternative.

Cite as

Nick Fischer, Evangelos Kipouridis, Jonas Klausen, and Mikkel Thorup. A Faster Algorithm for Constrained Correlation Clustering. In 42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 327, pp. 32:1-32:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{fischer_et_al:LIPIcs.STACS.2025.32,
  author =	{Fischer, Nick and Kipouridis, Evangelos and Klausen, Jonas and Thorup, Mikkel},
  title =	{{A Faster Algorithm for Constrained Correlation Clustering}},
  booktitle =	{42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025)},
  pages =	{32:1--32:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-365-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{327},
  editor =	{Beyersdorff, Olaf and Pilipczuk, Micha{\l} and Pimentel, Elaine and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2025.32},
  URN =		{urn:nbn:de:0030-drops-228585},
  doi =		{10.4230/LIPIcs.STACS.2025.32},
  annote =	{Keywords: Clustering, Constrained Correlation Clustering, Approximation}
}
Document
Cluster Editing on Cographs and Related Classes

Authors: Manuel Lafond, Alitzel López Sánchez, and Weidong Luo

Published in: LIPIcs, Volume 327, 42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025)


Abstract
In the Cluster Editing problem, sometimes known as (unweighted) Correlation Clustering, we must insert and delete a minimum number of edges to achieve a graph in which every connected component is a clique. Owing to its applications in computational biology, social network analysis, machine learning, and others, this problem has been widely studied for decades and is still undergoing active research. There exist several parameterized algorithms for general graphs, but little is known about the complexity of the problem on specific classes of graphs. Among the few important results in this direction, if only deletions are allowed, the problem can be solved in polynomial time on cographs, which are the P₄-free graphs. However, the complexity of the broader editing problem on cographs is still open. We show that even on a very restricted subclass of cographs, the problem is NP-hard, W[1]-hard when parameterized by the number p of desired clusters, and that time n^o(p/log p) is forbidden under the ETH. This shows that the editing variant is substantially harder than the deletion-only case, and that hardness holds for the many superclasses of cographs (including graphs of clique-width at most 2, perfect graphs, circle graphs, permutation graphs). On the other hand, we provide an almost tight upper bound of time n^O(p), which is a consequence of a more general n^O(cw⋅p) time algorithm, where cw is the clique-width. Given that forbidding P₄s maintains NP-hardness, we look at {P₄, C₄}-free graphs, also known as trivially perfect graphs, and provide a cubic-time algorithm for this class.

Cite as

Manuel Lafond, Alitzel López Sánchez, and Weidong Luo. Cluster Editing on Cographs and Related Classes. In 42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 327, pp. 64:1-64:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{lafond_et_al:LIPIcs.STACS.2025.64,
  author =	{Lafond, Manuel and L\'{o}pez S\'{a}nchez, Alitzel and Luo, Weidong},
  title =	{{Cluster Editing on Cographs and Related Classes}},
  booktitle =	{42nd International Symposium on Theoretical Aspects of Computer Science (STACS 2025)},
  pages =	{64:1--64:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-365-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{327},
  editor =	{Beyersdorff, Olaf and Pilipczuk, Micha{\l} and Pimentel, Elaine and Thắng, Nguy\~{ê}n Kim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2025.64},
  URN =		{urn:nbn:de:0030-drops-228895},
  doi =		{10.4230/LIPIcs.STACS.2025.64},
  annote =	{Keywords: Cluster editing, cographs, parameterized algorithms, clique-width, trivially perfect graphs}
}
Document
Graph Reconstruction via MIS Queries

Authors: Christian Konrad, Conor O'Sullivan, and Victor Traistaru

Published in: LIPIcs, Volume 325, 16th Innovations in Theoretical Computer Science Conference (ITCS 2025)


Abstract
In the Graph Reconstruction (GR) problem, a player initially only knows the vertex set V of an input graph G = (V, E) and is required to learn its set of edges E. To this end, the player submits queries to an oracle and must deduce E from the oracle’s answers. Angluin and Chen [Journal of Computer and System Sciences, 2008] resolved the number of Independent Set (IS) queries necessary and sufficient for GR on m-edge graphs. In this setting, each query consists of a subset of vertices U ⊆ V, and the oracle responds with a boolean, indicating whether U is an independent set in G. They gave algorithms that use O(m ⋅ log n) IS queries, which is best possible. In this paper, we initiate the study of GR via Maximal Independent Set (MIS) queries, a more powerful variant of IS queries. Given a query U ⊆ V, the oracle responds with any, potentially adversarially chosen, maximal independent set I ⊆ U in the induced subgraph G[U]. We show that, for GR, MIS queries are strictly more powerful than IS queries when parametrized by the maximum degree Δ of the input graph. We give tight (up to poly-logarithmic factors) upper and lower bounds for this problem: 1) We observe that the simple strategy of taking uniform independent random samples of V and submitting those to the oracle yields a non-adaptive randomized algorithm that executes O(Δ² ⋅ log n) queries and succeeds with high probability. This should be contrasted with the fact that Ω(Δ ⋅ n ⋅ log(n/Δ)) IS queries are required for such graphs, which shows that MIS queries are strictly more powerful than IS queries. Interestingly, combining the strategy of taking uniform random samples of V with the probabilistic method, we show the existence of a deterministic non-adaptive algorithm that executes O(Δ³ ⋅ log(n/Δ)) queries. 2) Regarding lower bounds, we prove that the additional Δ factor when going from randomized non-adaptive algorithms to deterministic non-adaptive algorithms is necessary. We show that every non-adaptive deterministic algorithm requires Ω(Δ³ / log² Δ) queries. For arbitrary randomized adaptive algorithms, we show that Ω(Δ²) queries are necessary in graphs of maximum degree Δ, and that Ω(log n) queries are necessary, even when the input graph is an n-vertex cycle.

Cite as

Christian Konrad, Conor O'Sullivan, and Victor Traistaru. Graph Reconstruction via MIS Queries. In 16th Innovations in Theoretical Computer Science Conference (ITCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 325, pp. 66:1-66:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{konrad_et_al:LIPIcs.ITCS.2025.66,
  author =	{Konrad, Christian and O'Sullivan, Conor and Traistaru, Victor},
  title =	{{Graph Reconstruction via MIS Queries}},
  booktitle =	{16th Innovations in Theoretical Computer Science Conference (ITCS 2025)},
  pages =	{66:1--66:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-361-4},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{325},
  editor =	{Meka, Raghu},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2025.66},
  URN =		{urn:nbn:de:0030-drops-226945},
  doi =		{10.4230/LIPIcs.ITCS.2025.66},
  annote =	{Keywords: Query Complexity, Graph Reconstruction, Maximal Independent Set Queries}
}
Document
Complete Volume
LIPIcs, Volume 322, ISAAC 2024, Complete Volume

Authors: Julián Mestre and Anthony Wirth

Published in: LIPIcs, Volume 322, 35th International Symposium on Algorithms and Computation (ISAAC 2024)


Abstract
LIPIcs, Volume 322, ISAAC 2024, Complete Volume

Cite as

35th International Symposium on Algorithms and Computation (ISAAC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 322, pp. 1-956, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@Proceedings{mestre_et_al:LIPIcs.ISAAC.2024,
  title =	{{LIPIcs, Volume 322, ISAAC 2024, Complete Volume}},
  booktitle =	{35th International Symposium on Algorithms and Computation (ISAAC 2024)},
  pages =	{1--956},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-354-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{322},
  editor =	{Mestre, Juli\'{a}n and Wirth, Anthony},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2024},
  URN =		{urn:nbn:de:0030-drops-222718},
  doi =		{10.4230/LIPIcs.ISAAC.2024},
  annote =	{Keywords: LIPIcs, Volume 322, ISAAC 2024, Complete Volume}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Julián Mestre and Anthony Wirth

Published in: LIPIcs, Volume 322, 35th International Symposium on Algorithms and Computation (ISAAC 2024)


Abstract
Front Matter, Table of Contents, Preface, Conference Organization

Cite as

35th International Symposium on Algorithms and Computation (ISAAC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 322, pp. 0:i-0:xviii, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{mestre_et_al:LIPIcs.ISAAC.2024.0,
  author =	{Mestre, Juli\'{a}n and Wirth, Anthony},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{35th International Symposium on Algorithms and Computation (ISAAC 2024)},
  pages =	{0:i--0:xviii},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-354-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{322},
  editor =	{Mestre, Juli\'{a}n and Wirth, Anthony},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2024.0},
  URN =		{urn:nbn:de:0030-drops-222702},
  doi =		{10.4230/LIPIcs.ISAAC.2024.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
  • Refine by Type
  • 85 Document/PDF
  • 12 Document/HTML
  • 1 Volume

  • Refine by Publication Year
  • 12 2025
  • 65 2024
  • 2 2023
  • 1 2022
  • 2 2020
  • Show More...

  • Refine by Author
  • 15 Wirth, Anthony
  • 5 Gan, Junhao
  • 3 Mestre, Julián
  • 3 Staals, Frank
  • 2 Bose, Prosenjit
  • Show More...

  • Refine by Series/Journal
  • 85 LIPIcs

  • Refine by Classification
  • 13 Theory of computation → Computational geometry
  • 12 Theory of computation → Design and analysis of algorithms
  • 8 Theory of computation → Fixed parameter tractability
  • 7 Mathematics of computing → Graph algorithms
  • 6 Theory of computation → Approximation algorithms analysis
  • Show More...

  • Refine by Keyword
  • 4 approximation algorithms
  • 3 Approximation Algorithms
  • 3 Clustering
  • 3 Inapproximability
  • 3 parameterized complexity
  • Show More...

Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail