168 Search Results for "Pagh, Rasmus"


Volume

LIPIcs, Volume 204

29th Annual European Symposium on Algorithms (ESA 2021)

ESA 2021, September 6-8, 2021, Lisbon, Portugal (Virtual Conference)

Editors: Petra Mutzel, Rasmus Pagh, and Grzegorz Herman

Volume

LIPIcs, Volume 53

15th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2016)

SWAT 2016, June 22-24, 2016, Reykjavik, Iceland

Editors: Rasmus Pagh

Document
Longest Common Substring with Gaps and Related Problems

Authors: Aranya Banerjee, Daniel Gibney, and Sharma V. Thankachan

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)


Abstract
The longest common substring (also known as longest common factor) and longest common subsequence problems are two well-studied classical string problems. The former is solvable in optimal 𝒪(n) time for two strings of length m and n with m ≤ n, and the latter is solvable in 𝒪(nm) time, which is conditionally optimal under the Strong Exponential Time Hypothesis. In this work, we study the problem of longest common factor with gaps, that is, finding a set of at most k matching substrings obeying precedence conditions with maximum total length. For k = 1, this is equivalent to the longest common factor problem, and for k = m, this is equivalent to the longest common subsequence problem. Our work demonstrates that, for constant k, this problem can be solved in strongly subquadratic time, i.e., nm^{1 - Θ(1)}. Motivated by co-linear chaining applications in Computational Biology, we further demonstrate that the longest common factor with gaps results can be extended to the case where the matches are restricted to maximal exact matches (MEMs). To further demonstrate the applicability of our techniques, we show that a similar approach can be used for a restricted version of the episode matching problem where one seeks an ordered set of at most k matches whose concatenation equals a query pattern P and the length of the substring of T containing the matches is minimized. These solutions all run in strongly subquadratic time for constant k.

Cite as

Aranya Banerjee, Daniel Gibney, and Sharma V. Thankachan. Longest Common Substring with Gaps and Related Problems. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 16:1-16:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{banerjee_et_al:LIPIcs.ESA.2024.16,
  author =	{Banerjee, Aranya and Gibney, Daniel and Thankachan, Sharma V.},
  title =	{{Longest Common Substring with Gaps and Related Problems}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{16:1--16:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.16},
  URN =		{urn:nbn:de:0030-drops-210877},
  doi =		{10.4230/LIPIcs.ESA.2024.16},
  annote =	{Keywords: Pattern Matching, Longest Common Subsequence, Episode Matching}
}
Document
A Euclidean Embedding for Computing Persistent Homology with Gaussian Kernels

Authors: Jean-Daniel Boissonnat and Kunal Dutta

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)


Abstract
Computing persistent homology of large datasets using Gaussian kernels is useful in the domains of topological data analysis and machine learning as shown by Phillips, Wang and Zheng [SoCG 2015]. However, unlike in the case of persistent homology computation using the Euclidean distance or the k-distance, using Gaussian kernels involves significantly higher overhead, as all distance computations are in terms of the Gaussian kernel distance which is computationally more expensive. Further, most algorithmic implementations (e.g. Gudhi, Ripser, etc.) are based on Euclidean distances, so the question of finding a Euclidean embedding - preferably low-dimensional - that preserves the persistent homology computed with Gaussian kernels, is quite important. We consider the Gaussian kernel power distance (GKPD) given by Phillips, Wang and Zheng. Given an n-point dataset and a relative error parameter {ε} ∈ (0,1], we show that the persistent homology of the {Čech } filtration of the dataset computed using the GKPD can be approximately preserved using O({ε}^{-2}log n) dimensions, under a high stable rank condition. Our results also extend to the Delaunay filtration and the (simpler) case of the weighted Rips filtrations constructed using the GKPD. Compared to the Euclidean embedding for the Gaussian kernel function in ∼ n dimensions, which uses the Cholesky decomposition of the matrix of the kernel function applied to all pairs of data points, our embedding may also be viewed as dimensionality reduction - reducing the dimensionality from n to ∼ log n dimensions. Our proof utilizes the embedding of Chen and Phillips [ALT 2017], based on the Random Fourier Functions of Rahimi and Recht [NeurIPS 2007], together with two novel ingredients. The first one is a new decomposition of the squared radii of {Čech } simplices computed using the GKPD, in terms of the pairwise GKPDs between the vertices, which we state and prove. The second is a new concentration inequality for sums of cosine functions of Gaussian random vectors, which we call Gaussian cosine chaoses. We believe these are of independent interest and will find other applications in future.

Cite as

Jean-Daniel Boissonnat and Kunal Dutta. A Euclidean Embedding for Computing Persistent Homology with Gaussian Kernels. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 29:1-29:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{boissonnat_et_al:LIPIcs.ESA.2024.29,
  author =	{Boissonnat, Jean-Daniel and Dutta, Kunal},
  title =	{{A Euclidean Embedding for Computing Persistent Homology with Gaussian Kernels}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{29:1--29:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.29},
  URN =		{urn:nbn:de:0030-drops-211009},
  doi =		{10.4230/LIPIcs.ESA.2024.29},
  annote =	{Keywords: Persistent homology, Gaussian kernels, Random Fourier Features, Euclidean embedding}
}
Document
Separable Convex Mixed-Integer Optimization: Improved Algorithms and Lower Bounds

Authors: Cornelius Brand, Martin Koutecký, Alexandra Lassota, and Sebastian Ordyniak

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)


Abstract
We provide several novel algorithms and lower bounds in central settings of mixed-integer (non-)linear optimization, shedding new light on classic results in the field. This includes an improvement on record running time bounds obtained from a slight extension of Lenstra’s 1983 algorithm [Math. Oper. Res. '83] to optimizing under few constraints with small coefficients. This is important for ubiquitous tasks like knapsack-, subset sum- or scheduling problems [Eisenbrand and Weismantel, SODA'18, Jansen and Rohwedder, ITCS'19]. Further, we extend our algorithm to an intermediate linear optimization problem when the matrix has many rows that exhibit 2-stage stochastic structure, which adds to a prominent line of recent results on this and similarly restricted cases [Jansen et al. ICALP'19, Cslovjecsek et al. SODA'21, Brand et al. AAAI'21, Klein, Reuter SODA'22, Cslovjecsek et al. SODA'24]. We also show that the generalization of two fundamental classes of structured constraints from these works (n-fold and 2-stage stochastic programs) to separable-convex mixed-integer optimization are harder than their mixed-integer, linear counterparts. This counters a widespread belief popularized initially by an influential paper of Hochbaum and Shanthikumar, namely that "convex separable optimization is not much harder than linear optimization" [J. ACM '90]. To obtain our algorithms, we employ the mixed Graver basis introduced by Hemmecke [Math. Prog. '03], and our work is the first to give bounds on the norm of its elements. Importantly, we use these bounds differently from how purely-integer Graver bounds are exploited in related approaches, and prove that, surprisingly, this cannot be avoided.

Cite as

Cornelius Brand, Martin Koutecký, Alexandra Lassota, and Sebastian Ordyniak. Separable Convex Mixed-Integer Optimization: Improved Algorithms and Lower Bounds. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 32:1-32:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{brand_et_al:LIPIcs.ESA.2024.32,
  author =	{Brand, Cornelius and Kouteck\'{y}, Martin and Lassota, Alexandra and Ordyniak, Sebastian},
  title =	{{Separable Convex Mixed-Integer Optimization: Improved Algorithms and Lower Bounds}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{32:1--32:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.32},
  URN =		{urn:nbn:de:0030-drops-211033},
  doi =		{10.4230/LIPIcs.ESA.2024.32},
  annote =	{Keywords: Mixed-Integer Programming, Separable Convex Optimization, Parameterized Algorithms, Parameterized Complexity}
}
Document
From Directed Steiner Tree to Directed Polymatroid Steiner Tree in Planar Graphs

Authors: Chandra Chekuri, Rhea Jain, Shubhang Kulkarni, Da Wei Zheng, and Weihao Zhu

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)


Abstract
In the Directed Steiner Tree (DST) problem the input is a directed edge-weighted graph G = (V,E), a root vertex r and a set S ⊆ V of k terminals. The goal is to find a min-cost subgraph that connects r to each of the terminals. DST admits an O(log² k/log log k)-approximation in quasi-polynomial time [Grandoni et al., 2022; Rohan Ghuge and Viswanath Nagarajan, 2022], and an O(k^{ε})-approximation for any fixed ε > 0 in polynomial-time [Alexander Zelikovsky, 1997; Moses Charikar et al., 1999]. Resolving the existence of a polynomial-time poly-logarithmic approximation is a major open problem in approximation algorithms. In a recent work, Friggstad and Mousavi [Zachary Friggstad and Ramin Mousavi, 2023] obtained a simple and elegant polynomial-time O(log k)-approximation for DST in planar digraphs via Thorup’s shortest path separator theorem [Thorup, 2004]. We build on their work and obtain several new results on DST and related problems. - We develop a tree embedding technique for rooted problems in planar digraphs via an interpretation of the recursion in [Zachary Friggstad and Ramin Mousavi, 2023]. Using this we obtain polynomial-time poly-logarithmic approximations for Group Steiner Tree [Naveen Garg et al., 2000], Covering Steiner Tree [Goran Konjevod et al., 2002] and the Polymatroid Steiner Tree [Gruia Călinescu and Alexander Zelikovsky, 2005] problems in planar digraphs. All these problems are hard to approximate to within a factor of Ω(log² n/log log n) even in trees [Eran Halperin and Robert Krauthgamer, 2003; Grandoni et al., 2022]. - We prove that the natural cut-based LP relaxation for DST has an integrality gap of O(log² k) in planar digraphs. This is in contrast to general graphs where the integrality gap of this LP is known to be Ω(√k) [Leonid Zosin and Samir Khuller, 2002] and Ω(n^{δ}) for some fixed δ > 0 [Shi Li and Bundit Laekhanukit, 2022]. - We combine the preceding results with density based arguments to obtain poly-logarithmic approximations for the multi-rooted versions of the problems in planar digraphs. For DST our result improves the O(R + log k) approximation of [Zachary Friggstad and Ramin Mousavi, 2023] when R = ω(log² k).

Cite as

Chandra Chekuri, Rhea Jain, Shubhang Kulkarni, Da Wei Zheng, and Weihao Zhu. From Directed Steiner Tree to Directed Polymatroid Steiner Tree in Planar Graphs. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 42:1-42:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{chekuri_et_al:LIPIcs.ESA.2024.42,
  author =	{Chekuri, Chandra and Jain, Rhea and Kulkarni, Shubhang and Zheng, Da Wei and Zhu, Weihao},
  title =	{{From Directed Steiner Tree to Directed Polymatroid Steiner Tree in Planar Graphs}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{42:1--42:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.42},
  URN =		{urn:nbn:de:0030-drops-211134},
  doi =		{10.4230/LIPIcs.ESA.2024.42},
  annote =	{Keywords: Directed Planar Graphs, Submodular Functions, Steiner Tree, Network Design}
}
Document
Invertible Bloom Lookup Tables with Less Memory and Randomness

Authors: Nils Fleischhacker, Kasper Green Larsen, Maciej Obremski, and Mark Simkin

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)


Abstract
In this work we study Invertible Bloom Lookup Tables (IBLTs) with small failure probabilities. IBLTs are highly versatile data structures that have found applications in set reconciliation protocols, error-correcting codes, and even the design of advanced cryptographic primitives. For storing n elements and ensuring correctness with probability at least 1 - δ, existing IBLT constructions require Ω(n((log(1/δ))/(log n))+1)) space and they crucially rely on fully random hash functions. We present new constructions of IBLTs that are simultaneously more space efficient and require less randomness. For storing n elements with a failure probability of at most δ, our data structure only requires O{n + log(1/δ)log log(1/δ)} space and O{log(log(n)/δ)}-wise independent hash functions. As a key technical ingredient we show that hashing n keys with any k-wise independent hash function h:U → [Cn] for some sufficiently large constant C guarantees with probability 1 - 2^{-Ω(k)} that at least n/2 keys will have a unique hash value. Proving this is non-trivial as k approaches n. We believe that the techniques used to prove this statement may be of independent interest. We apply our new IBLTs to the encrypted compression problem, recently studied by Fleischhacker, Larsen, Simkin (Eurocrypt 2023). We extend their approach to work for a more general class of encryption schemes and using our new IBLT we achieve an asymptotically better compression rate.

Cite as

Nils Fleischhacker, Kasper Green Larsen, Maciej Obremski, and Mark Simkin. Invertible Bloom Lookup Tables with Less Memory and Randomness. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 54:1-54:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{fleischhacker_et_al:LIPIcs.ESA.2024.54,
  author =	{Fleischhacker, Nils and Larsen, Kasper Green and Obremski, Maciej and Simkin, Mark},
  title =	{{Invertible Bloom Lookup Tables with Less Memory and Randomness}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{54:1--54:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.54},
  URN =		{urn:nbn:de:0030-drops-211252},
  doi =		{10.4230/LIPIcs.ESA.2024.54},
  annote =	{Keywords: Invertible Bloom Lookup Tables}
}
Document
A Simple Representation of Tree Covering Utilizing Balanced Parentheses and Efficient Implementation of Average-Case Optimal RMQs

Authors: Kou Hamada, Sankardeep Chakraborty, Seungbum Jo, Takuto Koriyama, Kunihiko Sadakane, and Srinivasa Rao Satti

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)


Abstract
Tree covering is a technique for decomposing a tree into smaller sized trees with desirable properties, and has been employed in various succinct data structures. However, significant hurdles stand in the way of a practical implementation of tree covering: a lot of pointers are used to maintain the tree-covering hierarchy and many indices for tree navigational queries consume theoretically negligible yet practically vast space. To tackle these problems, we propose a simple representation of tree covering using a balanced-parenthesis representation. The key to the proposal is the observation that every micro tree splits into at most two intervals on the BP representation. Utilizing the representation, we propose several data structures that represent a tree and its tree cover, which consequently allow micro tree compression with arbitrary coding and efficient tree navigational queries. We also applied our data structure to average-case optimal RMQ by Munro et al. [ESA 2021] and implemented the RMQ data structure. Our RMQ data structures spend less than 2n bits and process queries in a practical time on several settings of the performance evaluation, reducing the gap between theoretical space complexity and actual space consumption. For example, our implementation consumes 1.822n bits and processes queries in 5µs on average for random queries and in 13µs on average for the worst query widths. We also implement tree navigational operations while using the same amount of space as the RMQ data structures. We believe the representation can be widely utilized for designing practically memory-efficient data structures based on tree covering.

Cite as

Kou Hamada, Sankardeep Chakraborty, Seungbum Jo, Takuto Koriyama, Kunihiko Sadakane, and Srinivasa Rao Satti. A Simple Representation of Tree Covering Utilizing Balanced Parentheses and Efficient Implementation of Average-Case Optimal RMQs. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 64:1-64:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{hamada_et_al:LIPIcs.ESA.2024.64,
  author =	{Hamada, Kou and Chakraborty, Sankardeep and Jo, Seungbum and Koriyama, Takuto and Sadakane, Kunihiko and Satti, Srinivasa Rao},
  title =	{{A Simple Representation of Tree Covering Utilizing Balanced Parentheses and Efficient Implementation of Average-Case Optimal RMQs}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{64:1--64:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.64},
  URN =		{urn:nbn:de:0030-drops-211359},
  doi =		{10.4230/LIPIcs.ESA.2024.64},
  annote =	{Keywords: Hypersuccinct trees, Succinct data structures, Range minimum queries, Binary trees}
}
Document
PHOBIC: Perfect Hashing With Optimized Bucket Sizes and Interleaved Coding

Authors: Stefan Hermann, Hans-Peter Lehmann, Giulio Ermanno Pibiri, Peter Sanders, and Stefan Walzer

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)


Abstract
A minimal perfect hash function (or MPHF) maps a set of n keys to [n] : = {1, …, n} without collisions. Such functions find widespread application e.g. in bioinformatics and databases. In this paper we revisit PTHash - a construction technique particularly designed for fast queries. PTHash distributes the input keys into small buckets and, for each bucket, it searches for a hash function seed that places its keys in the output domain without collisions. The collection of all seeds is then stored in a compressed way. Since the first buckets are easier to place, buckets are considered in non-increasing order of size. Additionally, PTHash heuristically produces an imbalanced distribution of bucket sizes by distributing 60% of the keys into 30% of the buckets. Our main contribution is to characterize, up to lower order terms, an optimal choice for the expected bucket sizes, improving construction throughput for space efficient configurations both in theory and practice. Further contributions include a new encoding scheme for seeds that works across partitions of the data structure and a GPU parallelization. Compared to PTHash, PHOBIC is 0.17 bits/key more space efficient for same query time and construction throughput. For a configuration with fast queries, our GPU implementation can construct an MPHF at 2.17 bits/key in 28 ns/key, which can be queried in 37 ns/query on the CPU.

Cite as

Stefan Hermann, Hans-Peter Lehmann, Giulio Ermanno Pibiri, Peter Sanders, and Stefan Walzer. PHOBIC: Perfect Hashing With Optimized Bucket Sizes and Interleaved Coding. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 69:1-69:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{hermann_et_al:LIPIcs.ESA.2024.69,
  author =	{Hermann, Stefan and Lehmann, Hans-Peter and Pibiri, Giulio Ermanno and Sanders, Peter and Walzer, Stefan},
  title =	{{PHOBIC: Perfect Hashing With Optimized Bucket Sizes and Interleaved Coding}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{69:1--69:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.69},
  URN =		{urn:nbn:de:0030-drops-211405},
  doi =		{10.4230/LIPIcs.ESA.2024.69},
  annote =	{Keywords: Compressed Data Structures, Minimal Perfect Hashing, GPU}
}
Document
Improved Space-Efficient Approximate Nearest Neighbor Search Using Function Inversion

Authors: Samuel McCauley

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)


Abstract
Approximate nearest neighbor search (ANN) data structures have widespread applications in machine learning, computational biology, and text processing. The goal of ANN is to preprocess a set S so that, given a query q, we can find a point y whose distance from q approximates the smallest distance from q to any point in S. For most distance functions, the best-known ANN bounds for high-dimensional point sets are obtained using techniques based on locality-sensitive hashing (LSH). Unfortunately, space efficiency is a major challenge for LSH-based data structures. Classic LSH techniques require a very large amount of space, oftentimes polynomial in |S|. A long line of work has developed intricate techniques to reduce this space usage, but these techniques suffer from downsides: they must be hand tailored to each specific LSH, are often complicated, and their space reduction comes at the cost of significantly increased query times. In this paper we explore a new way to improve the space efficiency of LSH using function inversion techniques, originally developed in (Fiat and Naor 2000). We begin by describing how function inversion can be used to improve LSH data structures. This gives a fairly simple, black box method to reduce LSH space usage. Then, we give a data structure that leverages function inversion to improve the query time of the best known near-linear space data structure for approximate nearest neighbor search under Euclidean distance: the ALRW data structure of (Andoni, Laarhoven, Razenshteyn, and Waingarten 2017). ALRW was previously shown to be optimal among "list-of-points" data structures for both Euclidean and Manhattan ANN; thus, in addition to giving improved bounds, our results imply that list-of-points data structures are not optimal for Euclidean or Manhattan ANN .

Cite as

Samuel McCauley. Improved Space-Efficient Approximate Nearest Neighbor Search Using Function Inversion. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 88:1-88:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{mccauley:LIPIcs.ESA.2024.88,
  author =	{McCauley, Samuel},
  title =	{{Improved Space-Efficient Approximate Nearest Neighbor Search Using Function Inversion}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{88:1--88:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.88},
  URN =		{urn:nbn:de:0030-drops-211590},
  doi =		{10.4230/LIPIcs.ESA.2024.88},
  annote =	{Keywords: similarity search, locality-sensitive hashing, randomized algorithms, data structures, space efficiency, function inversion}
}
Document
SubModST: A Fast Generic Solver for Submodular Maximization with Size Constraints

Authors: Henning Martin Woydt, Christian Komusiewicz, and Frank Sommer

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)


Abstract
In the Cardinality-Constrained Maximization (Minimization) problem the input is a universe 𝒰, a function f: 2^{{𝒰}} → ℝ, and an integer k, and the task is to find a set S ⊆ 𝒰 with |S| ≤ k that maximizes (minimizes) f(S). Many well-studied problems such as Facility Location, Partial Dominating Set, Group Closeness Centrality and Euclidean k-Medoid Clustering are special cases of Cardinality-Constrained Maximization (Minimization). All the above-mentioned problems have the diminishing return property, that is, the improvement of adding an element e ∈ 𝒰 to a set S is at least as large as adding e to any superset of S. This property is called submodularity for maximization problems and supermodularity for minimization problems. In this work we develop a new exact branch-and-cut algorithm SubModST for the generic Submodular Cardinality-Constrained Maximization and Supermodular Cardinality-Constrained Minimization. We develop several speed-ups for SubModST and we show their effectiveness on six example problems. We show that SubModST outperforms the state-of-the-art solvers developed by Csókás and Vinkó [J. Glob. Optim. '24] and Uematsu et al. [J. Oper. Res. Soc. Japan '20] for Submodular Cardinality-Constrained Maximization by orders of magnitudes.

Cite as

Henning Martin Woydt, Christian Komusiewicz, and Frank Sommer. SubModST: A Fast Generic Solver for Submodular Maximization with Size Constraints. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 102:1-102:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{woydt_et_al:LIPIcs.ESA.2024.102,
  author =	{Woydt, Henning Martin and Komusiewicz, Christian and Sommer, Frank},
  title =	{{SubModST: A Fast Generic Solver for Submodular Maximization with Size Constraints}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{102:1--102:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.102},
  URN =		{urn:nbn:de:0030-drops-211730},
  doi =		{10.4230/LIPIcs.ESA.2024.102},
  annote =	{Keywords: Branch-and-Cut, Lazy Evaluations, Facility Location, Group Closeness Centrality, Partial Dominating Set}
}
Document
APPROX
Learning-Augmented Maximum Independent Set

Authors: Vladimir Braverman, Prathamesh Dharangutte, Vihan Shah, and Chen Wang

Published in: LIPIcs, Volume 317, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)


Abstract
We study the Maximum Independent Set (MIS) problem on general graphs within the framework of learning-augmented algorithms. The MIS problem is known to be NP-hard and is also NP-hard to approximate to within a factor of n^(1-δ) for any δ > 0. We show that we can break this barrier in the presence of an oracle obtained through predictions from a machine learning model that answers vertex membership queries for a fixed MIS with probability 1/2+ε. In the first setting we consider, the oracle can be queried once per vertex to know if a vertex belongs to a fixed MIS, and the oracle returns the correct answer with probability 1/2 + ε. Under this setting, we show an algorithm that obtains an Õ((√Δ)/ε)-approximation in O(m) time where Δ is the maximum degree of the graph. In the second setting, we allow multiple queries to the oracle for a vertex, each of which is correct with probability 1/2 + ε. For this setting, we show an O(1)-approximation algorithm using O(n/ε²) total queries and Õ(m) runtime.

Cite as

Vladimir Braverman, Prathamesh Dharangutte, Vihan Shah, and Chen Wang. Learning-Augmented Maximum Independent Set. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 317, pp. 24:1-24:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{braverman_et_al:LIPIcs.APPROX/RANDOM.2024.24,
  author =	{Braverman, Vladimir and Dharangutte, Prathamesh and Shah, Vihan and Wang, Chen},
  title =	{{Learning-Augmented Maximum Independent Set}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)},
  pages =	{24:1--24:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-348-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{317},
  editor =	{Kumar, Amit and Ron-Zewi, Noga},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2024.24},
  URN =		{urn:nbn:de:0030-drops-210179},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2024.24},
  annote =	{Keywords: Learning-augmented algorithms, maximum independent set, graph algorithms}
}
Document
APPROX
Improved Streaming Algorithm for the Klee’s Measure Problem and Generalizations

Authors: Mridul Nandi, N. V. Vinodchandran, Arijit Ghosh, Kuldeep S. Meel, Soumit Pal, and Sourav Chakraborty

Published in: LIPIcs, Volume 317, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)


Abstract
Estimating the size of the union of a stream of sets S₁, S₂, …, S_M where each set is a subset of a known universe Ω is a fundamental problem in data streaming. This problem naturally generalizes the well-studied 𝖥₀ estimation problem in the streaming literature, where each set contains a single element from the universe. We consider the general case when the sets S_i can be succinctly represented and allow efficient membership, cardinality, and sampling queries (called a Delphic family of sets). A notable example in this framework is the Klee’s Measure Problem (KMP), where every set S_i is an axis-parallel rectangle in d-dimensional spaces (Ω = [Δ]^d where [Δ] := {1, … ,Δ} and Δ ∈ ℕ). Recently, Meel, Chakraborty, and Vinodchandran (PODS-21, PODS-22) designed a streaming algorithm for (ε,δ)-estimation of the size of the union of set streams over Delphic family with space and update time complexity O((log³|Ω|)/ε² ⋅ log 1/δ) and Õ((log⁴|Ω|)/ε² ⋅ log 1/(δ)), respectively. This work presents a new, sampling-based algorithm for estimating the size of the union of Delphic sets that has space and update time complexity Õ((log²|Ω|)/ε² ⋅ log 1/(δ)). This improves the space complexity bound by a log|Ω| factor and update time complexity bound by a log² |Ω| factor. A critical question is whether quadratic dependence of log|Ω| on space and update time complexities is necessary. Specifically, can we design a streaming algorithm for estimating the size of the union of sets over Delphic family with space and complexity linear in log|Ω| and update time poly(log|Ω|)? While this appears technically challenging, we show that establishing a lower bound of ω(log|Ω|) with poly(log|Ω|) update time is beyond the reach of current techniques. Specifically, we show that under certain hard-to-prove computational complexity hypothesis, there is a streaming algorithm for the problem with optimal space complexity O(log|Ω|) and update time poly(log(|Ω|)). Thus, establishing a space lower bound of ω(log|Ω|) will lead to break-through complexity class separation results.

Cite as

Mridul Nandi, N. V. Vinodchandran, Arijit Ghosh, Kuldeep S. Meel, Soumit Pal, and Sourav Chakraborty. Improved Streaming Algorithm for the Klee’s Measure Problem and Generalizations. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 317, pp. 26:1-26:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{nandi_et_al:LIPIcs.APPROX/RANDOM.2024.26,
  author =	{Nandi, Mridul and Vinodchandran, N. V. and Ghosh, Arijit and Meel, Kuldeep S. and Pal, Soumit and Chakraborty, Sourav},
  title =	{{Improved Streaming Algorithm for the Klee’s Measure Problem and Generalizations}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)},
  pages =	{26:1--26:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-348-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{317},
  editor =	{Kumar, Amit and Ron-Zewi, Noga},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2024.26},
  URN =		{urn:nbn:de:0030-drops-210191},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2024.26},
  annote =	{Keywords: Sampling, Streaming, Klee’s Measure Problem}
}
Document
RANDOM
Matrix Multiplication Reductions

Authors: Ashish Gola, Igor Shinkar, and Harsimran Singh

Published in: LIPIcs, Volume 317, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)


Abstract
In this paper we study a worst case to average case reduction for the problem of matrix multiplication over finite fields. Suppose we have an efficient average case algorithm, that given two random matrices A,B outputs a matrix that has a non-trivial correlation with their product A ⋅ B. Can we transform it into a worst case algorithm, that outputs the correct answer for all inputs without incurring a significant overhead in the running time? We present two results in this direction. - Two-sided error in the high agreement regime. We begin with a brief remark about a reduction for high agreement algorithms, i.e., an algorithm which agrees with the correct output on a large (say > 0.9) fraction of entries, and show that the standard self-correction of linearity allows us to transform such algorithms into algorithms that work in worst case. - One-sided error in the low agreement regime. Focusing on average case algorithms with one-sided error, we show that over 𝔽₂ there is a reduction that gets an O(T) time average case algorithm that given a random input A,B outputs a matrix that agrees with A ⋅ B on at least 51% of the entries (i.e., has only a slight advantage over the trivial algorithm), and transforms it into an Õ(T) time worst case algorithm, that outputs the correct answer for all inputs with high probability.

Cite as

Ashish Gola, Igor Shinkar, and Harsimran Singh. Matrix Multiplication Reductions. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 317, pp. 34:1-34:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{gola_et_al:LIPIcs.APPROX/RANDOM.2024.34,
  author =	{Gola, Ashish and Shinkar, Igor and Singh, Harsimran},
  title =	{{Matrix Multiplication Reductions}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)},
  pages =	{34:1--34:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-348-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{317},
  editor =	{Kumar, Amit and Ron-Zewi, Noga},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2024.34},
  URN =		{urn:nbn:de:0030-drops-210274},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2024.34},
  annote =	{Keywords: Matrix Multiplication, Reductions, Worst case to average case reductions}
}
Document
RANDOM
Private Counting of Distinct Elements in the Turnstile Model and Extensions

Authors: Monika Henzinger, A. R. Sricharan, and Teresa Anna Steiner

Published in: LIPIcs, Volume 317, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)


Abstract
Privately counting distinct elements in a stream is a fundamental data analysis problem with many applications in machine learning. In the turnstile model, Jain et al. [NeurIPS2023] initiated the study of this problem parameterized by the maximum flippancy of any element, i.e., the number of times that the count of an element changes from 0 to above 0 or vice versa. They give an item-level (ε,δ)-differentially private algorithm whose additive error is tight with respect to that parameterization. In this work, we show that a very simple algorithm based on the sparse vector technique achieves a tight additive error for item-level (ε,δ)-differential privacy and item-level ε-differential privacy with regards to a different parameterization, namely the sum of all flippancies. Our second result is a bound which shows that for a large class of algorithms, including all existing differentially private algorithms for this problem, the lower bound from item-level differential privacy extends to event-level differential privacy. This partially answers an open question by Jain et al. [NeurIPS2023].

Cite as

Monika Henzinger, A. R. Sricharan, and Teresa Anna Steiner. Private Counting of Distinct Elements in the Turnstile Model and Extensions. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 317, pp. 40:1-40:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{henzinger_et_al:LIPIcs.APPROX/RANDOM.2024.40,
  author =	{Henzinger, Monika and Sricharan, A. R. and Steiner, Teresa Anna},
  title =	{{Private Counting of Distinct Elements in the Turnstile Model and Extensions}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)},
  pages =	{40:1--40:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-348-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{317},
  editor =	{Kumar, Amit and Ron-Zewi, Noga},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2024.40},
  URN =		{urn:nbn:de:0030-drops-210335},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2024.40},
  annote =	{Keywords: differential privacy, turnstile model, counting distinct elements}
}
  • Refine by Author
  • 15 Pagh, Rasmus
  • 4 Friedrich, Tobias
  • 3 Bläsius, Thomas
  • 3 Chan, Timothy M.
  • 3 Maheshwari, Anil
  • Show More...

  • Refine by Classification
  • 17 Theory of computation → Computational geometry
  • 17 Theory of computation → Design and analysis of algorithms
  • 14 Theory of computation → Data structures design and analysis
  • 12 Mathematics of computing → Graph algorithms
  • 10 Theory of computation → Graph algorithms analysis
  • Show More...

  • Refine by Keyword
  • 6 approximation
  • 4 approximation algorithms
  • 4 fixed-parameter tractability
  • 4 locality-sensitive hashing
  • 3 Facility Location
  • Show More...

  • Refine by Type
  • 166 document
  • 2 volume

  • Refine by Publication Year
  • 87 2021
  • 36 2016
  • 34 2024
  • 3 2017
  • 3 2020
  • Show More...

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail