11 Search Results for "Razenshteyn, Ilya"


Document
Dimension Reduction for Clustering: The Curious Case of Discrete Centers

Authors: Shaofeng H.-C. Jiang, Robert Krauthgamer, Shay Sapir, Sandeep Silwal, and Di Yue

Published in: LIPIcs, Volume 362, 17th Innovations in Theoretical Computer Science Conference (ITCS 2026)


Abstract
The Johnson-Lindenstrauss transform is a fundamental method for dimension reduction in Euclidean spaces, which can map any dataset of n points into dimension O(log n) with low distortion of their distances. This dimension bound is tight in general, but one can bypass it for specific problems. Indeed, tremendous progress has been made for clustering problems, especially in the continuous setting where centers can be picked from the ambient space ℝ^d. Most notably, for k-median and k-means, the dimension bound was improved to O(log k) [Makarychev, Makarychev and Razenshteyn, STOC 2019]. We explore dimension reduction for clustering in the discrete setting, where centers can only be picked from the dataset, and present two results that are both parameterized by the doubling dimension of the dataset, denoted as ddim. The first result shows that dimension O_{ε}(ddim + log k + log log n) suffices, and is moreover tight, to guarantee that the cost is preserved within factor 1±ε for every set of centers. Our second result eliminates the log log n term in the dimension through a relaxation of the guarantee (namely, preserving the cost only for all approximately-optimal sets of centers), which maintains its usefulness for downstream applications. Overall, we achieve strong dimension reduction in the discrete setting, and find that it differs from the continuous setting not only in the dimension bound, which depends on the doubling dimension, but also in the guarantees beyond preserving the optimal value, such as which clusterings are preserved.

Cite as

Shaofeng H.-C. Jiang, Robert Krauthgamer, Shay Sapir, Sandeep Silwal, and Di Yue. Dimension Reduction for Clustering: The Curious Case of Discrete Centers. In 17th Innovations in Theoretical Computer Science Conference (ITCS 2026). Leibniz International Proceedings in Informatics (LIPIcs), Volume 362, pp. 82:1-82:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2026)


@InProceedings{jiang_et_al:LIPIcs.ITCS.2026.82,
  author =	{Jiang, Shaofeng H.-C. and Krauthgamer, Robert and Sapir, Shay and Silwal, Sandeep and Yue, Di},
  title =	{{Dimension Reduction for Clustering: The Curious Case of Discrete Centers}},
  booktitle =	{17th Innovations in Theoretical Computer Science Conference (ITCS 2026)},
  pages =	{82:1--82:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-410-9},
  ISSN =	{1868-8969},
  year =	{2026},
  volume =	{362},
  editor =	{Saraf, Shubhangi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2026.82},
  URN =		{urn:nbn:de:0030-drops-253698},
  doi =		{10.4230/LIPIcs.ITCS.2026.82},
  annote =	{Keywords: dimension reduction, clustering, k-median, k-means, doubling dimension}
}
Document
Clustering in Varying Metrics

Authors: Deeparnab Chakrabarty, Jonathan Conroy, and Ankita Sarkar

Published in: LIPIcs, Volume 360, 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)


Abstract
We introduce the aggregated clustering problem, where one is given T instances of a center-based clustering task over the same n points, but under different metrics. The goal is to open k centers to minimize an aggregate of the clustering costs - e.g., the average or maximum - where the cost is measured via k-center/median/means objectives. More generally, we minimize a norm Ψ over the T cost values. We show that for T ≥ 3, the problem is inapproximable to any finite factor in polynomial time. For T = 2, we give constant-factor approximations. We also show W[2]-hardness when parameterized by k, but obtain f(k,T)poly(n)-time 3-approximations when parameterized by both k and T. When the metrics have structure, we obtain efficient parameterized approximation schemes (EPAS). If all T metrics have bounded ε-scatter dimension, we achieve a (1+ε)-approximation in f(k,T,ε)poly(n) time. If the metrics are induced by edge weights on a common graph G of bounded treewidth tw, and Ψ is the sum function, we get an EPAS in f(T,ε,tw)poly(n,k) time. Conversely, unless (randomized) ETH is false, any finite-factor approximation is impossible if parameterized by only T, even when the treewidth is tw = Ω(polylog n).

Cite as

Deeparnab Chakrabarty, Jonathan Conroy, and Ankita Sarkar. Clustering in Varying Metrics. In 45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 360, pp. 19:1-19:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{chakrabarty_et_al:LIPIcs.FSTTCS.2025.19,
  author =	{Chakrabarty, Deeparnab and Conroy, Jonathan and Sarkar, Ankita},
  title =	{{Clustering in Varying Metrics}},
  booktitle =	{45th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2025)},
  pages =	{19:1--19:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-406-2},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{360},
  editor =	{Aiswarya, C. and Mehta, Ruta and Roy, Subhajit},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2025.19},
  URN =		{urn:nbn:de:0030-drops-251007},
  doi =		{10.4230/LIPIcs.FSTTCS.2025.19},
  annote =	{Keywords: Clustering, approximation algorithms, LP rounding, parameterized and exact algorithms, dynamic programming, fixed parameter tractability, hardness of approximation}
}
Document
Streaming Diameter of High-Dimensional Points

Authors: Magnús M. Halldórsson, Nicolaos Matsakis, and Pavel Veselý

Published in: LIPIcs, Volume 351, 33rd Annual European Symposium on Algorithms (ESA 2025)


Abstract
We improve the space bound for streaming approximation of Diameter, as well as of Farthest Neighbor queries, Minimum Enclosing Ball, and its Coreset, in high-dimensional Euclidean spaces. In particular, our deterministic streaming algorithms store 𝒪(ε^{-2}log(1/ε)) points. This improves by a factor of ε^{-1} the previous space bound of Agarwal and Sharathkumar (SODA 2010), while retaining the state-of-the-art approximation guarantees, such as √2+ε for Diameter or Farthest Neighbor queries, and also offering a simpler and more complete argument. Moreover, we show that storing Ω(ε^{-1}) points is necessary for a streaming (√2+ε)-approximation of Farthest Pair and Farthest Neighbor queries.

Cite as

Magnús M. Halldórsson, Nicolaos Matsakis, and Pavel Veselý. Streaming Diameter of High-Dimensional Points. In 33rd Annual European Symposium on Algorithms (ESA 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 351, pp. 58:1-58:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{halldorsson_et_al:LIPIcs.ESA.2025.58,
  author =	{Halld\'{o}rsson, Magn\'{u}s M. and Matsakis, Nicolaos and Vesel\'{y}, Pavel},
  title =	{{Streaming Diameter of High-Dimensional Points}},
  booktitle =	{33rd Annual European Symposium on Algorithms (ESA 2025)},
  pages =	{58:1--58:10},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-395-9},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{351},
  editor =	{Benoit, Anne and Kaplan, Haim and Wild, Sebastian and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2025.58},
  URN =		{urn:nbn:de:0030-drops-245263},
  doi =		{10.4230/LIPIcs.ESA.2025.58},
  annote =	{Keywords: streaming algorithm, farthest pair, diameter, minimum enclosing ball, coreset}
}
Document
Track A: Algorithms, Complexity and Games
Even Faster Algorithm for the Chamfer Distance

Authors: Ying Feng and Piotr Indyk

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)


Abstract
For two d-dimensional point sets A,B of size up to n, the Chamfer distance from A to B is defined as CH(A,B) = ∑_{a ∈ A} min_{b ∈ B} ‖a-b‖. The Chamfer distance is a widely used measure for quantifying dissimilarity between sets of points, used in many machine learning and computer vision applications. A recent work of Bakshi et al., NeurIPS'23, gave the first near-linear time (1+ε)-approximate algorithm, with a running time of 𝒪(nd log(n)/ε²). In this paper we improve the running time further, to 𝒪(nd(log log n + log(1/ε))/ε²). When ε is a constant, this reduces the gap between the upper bound and the trivial Ω(dn) lower bound significantly, from 𝒪(log n) to 𝒪(log log n).
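
For orientation, the definition of CH(A,B) in this abstract can be evaluated directly in time proportional to |A|·|B|·d. The following is a minimal brute-force sketch of that definition only, not the near-linear approximation algorithm the paper describes; the function name `chamfer_distance` and the sample points are illustrative assumptions.

```python
import math

def chamfer_distance(A, B):
    """Brute-force CH(A, B): for each a in A, add the Euclidean
    distance to its nearest point in B. Exact, but quadratic time."""
    if not B:
        raise ValueError("B must be non-empty")
    return sum(min(math.dist(a, b) for b in B) for a in A)

# Hypothetical sample data: two small planar point sets.
A = [(0.0, 0.0), (3.0, 4.0)]
B = [(0.0, 1.0), (3.0, 0.0)]

print(chamfer_distance(A, B))  # distance from A to B
print(chamfer_distance(B, A))  # distance from B to A
```

The two calls generally return different values, since the definition is directional: it sums over the points of the first set only.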

Cite as

Ying Feng and Piotr Indyk. Even Faster Algorithm for the Chamfer Distance. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 76:1-76:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{feng_et_al:LIPIcs.ICALP.2025.76,
  author =	{Feng, Ying and Indyk, Piotr},
  title =	{{Even Faster Algorithm for the Chamfer Distance}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{76:1--76:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.76},
  URN =		{urn:nbn:de:0030-drops-234531},
  doi =		{10.4230/LIPIcs.ICALP.2025.76},
  annote =	{Keywords: Chamfer distance}
}
Document
Track A: Algorithms, Complexity and Games
Guessing Efficiently for Constrained Subspace Approximation

Authors: Aditya Bhaskara, Sepideh Mahabadi, Madhusudhan Reddy Pittu, Ali Vakilian, and David P. Woodruff

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)


Abstract
In this paper we study the constrained subspace approximation problem. Given a set of n points {a₁,…,a_n} in ℝ^d, the goal of the subspace approximation problem is to find a k dimensional subspace that best approximates the input points. More precisely, for a given p ≥ 1, we aim to minimize the pth power of the 𝓁_p norm of the error vector (‖a₁-Pa₁‖,…,‖a_n-Pa_n‖), where P denotes the projection matrix onto the subspace and the norms are Euclidean. In constrained subspace approximation (CSA), we additionally have constraints on the projection matrix P. In its most general form, we require P to belong to a given subset 𝒮 that is described explicitly or implicitly. We introduce a general framework for constrained subspace approximation. Our approach, which we term coreset-guess-solve, yields either (1+ε)-multiplicative or ε-additive approximations for a variety of constraints. We show that it provides new algorithms for partition-constrained subspace approximation with applications to fair subspace approximation, k-means clustering, and projected non-negative matrix factorization, among others. Specifically, while we reconstruct the best known bounds for k-means clustering in Euclidean spaces, we improve the known results for the remainder of the problems.

Cite as

Aditya Bhaskara, Sepideh Mahabadi, Madhusudhan Reddy Pittu, Ali Vakilian, and David P. Woodruff. Guessing Efficiently for Constrained Subspace Approximation. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 29:1-29:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{bhaskara_et_al:LIPIcs.ICALP.2025.29,
  author =	{Bhaskara, Aditya and Mahabadi, Sepideh and Pittu, Madhusudhan Reddy and Vakilian, Ali and Woodruff, David P.},
  title =	{{Guessing Efficiently for Constrained Subspace Approximation}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{29:1--29:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.29},
  URN =		{urn:nbn:de:0030-drops-234068},
  doi =		{10.4230/LIPIcs.ICALP.2025.29},
  annote =	{Keywords: parameterized complexity, low rank approximation, fairness, non-negative matrix factorization, clustering}
}
Document
Track A: Algorithms, Complexity and Games
Coresets for Robust Clustering via Black-Box Reductions to Vanilla Case

Authors: Shaofeng H.-C. Jiang and Jianing Lou

Published in: LIPIcs, Volume 334, 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)


Abstract
We devise ε-coresets for robust (k,z)-Clustering with m outliers through black-box reductions to vanilla clustering. Given an ε-coreset construction for vanilla clustering with size N, we construct coresets of size N ⋅ polylog(kmε^{-1}) + O_z(min{kmε^{-1}, mε^{-2z}log^z(kmε^{-1})}) for various metric spaces, where O_z hides 2^{O(z log z)} factors. This increases the size of the vanilla coreset by a small multiplicative factor of polylog(kmε^{-1}), and the additive term is within a (ε^{-1}log(km))^{O(z)} factor of the size of the optimal robust coreset. Plugging in recent vanilla coreset results of [Cohen-Addad, Saulpic and Schwiegelshohn, STOC'21; Cohen-Addad, Draganov, Russo, Saulpic and Schwiegelshohn, SODA'25], we obtain the first coresets for (k,z)-Clustering with m outliers with size near-linear in k while previous results have size at least Ω(k²) [Huang, Jiang, Lou and Wu, ICLR'23; Huang, Li, Lu and Wu, SODA'25]. Technically, we establish two conditions under which a vanilla coreset is also a robust coreset. The first condition requires the dataset to satisfy special structures - it can be broken into "dense" parts with bounded diameter. We combine this with a new bounded-diameter decomposition that has only O_z(kmε^{-1}) non-dense points to obtain the O_z(kmε^{-1}) additive bound. Another sufficient condition requires the vanilla coreset to possess an extra size-preserving property. To utilize this condition, we further give a black-box reduction that turns a vanilla coreset into one that satisfies the said size-preserving property, and this leads to the alternative O_z(mε^{-2z}log^{z}(kmε^{-1})) additive size bound. We also give low-space implementations of our reductions in the dynamic streaming setting. Combined with known streaming constructions for vanilla coresets [Braverman, Frahling, Lang, Sohler and Yang, ICML'17; Hu, Song, Yang and Zhong, arXiv'1802.00459], we obtain the first dynamic streaming algorithms for coresets for k-Median (and k-Means) with m outliers, using space Õ(k + m) ⋅ poly(dε^{-1}log Δ) for inputs on a discrete grid [Δ]^d.

Cite as

Shaofeng H.-C. Jiang and Jianing Lou. Coresets for Robust Clustering via Black-Box Reductions to Vanilla Case. In 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 334, pp. 101:1-101:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{jiang_et_al:LIPIcs.ICALP.2025.101,
  author =	{Jiang, Shaofeng H.-C. and Lou, Jianing},
  title =	{{Coresets for Robust Clustering via Black-Box Reductions to Vanilla Case}},
  booktitle =	{52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025)},
  pages =	{101:1--101:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-372-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{334},
  editor =	{Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo\"{e}l and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.101},
  URN =		{urn:nbn:de:0030-drops-234781},
  doi =		{10.4230/LIPIcs.ICALP.2025.101},
  annote =	{Keywords: Coresets, clustering, outliers, streaming algorithms}
}
Document
Lipschitz Decompositions of Finite 𝓁_{p} Metrics

Authors: Robert Krauthgamer and Nir Petruschka

Published in: LIPIcs, Volume 332, 41st International Symposium on Computational Geometry (SoCG 2025)


Abstract
Lipschitz decomposition is a useful tool in the design of efficient algorithms involving metric spaces. While many bounds are known for different families of finite metrics, the optimal parameters for n-point subsets of 𝓁_p, for p > 2, remained open, see e.g. [Naor, SODA 2017]. We make significant progress on this question and establish the bound β = O(log^{1-1/p} n). Building on prior work, we demonstrate applications of this result to two problems, high-dimensional geometric spanners and distance labeling schemes. In addition, we sharpen a related decomposition bound for 1 < p < 2, due to Filtser and Neiman [Algorithmica 2022].

Cite as

Robert Krauthgamer and Nir Petruschka. Lipschitz Decompositions of Finite 𝓁_{p} Metrics. In 41st International Symposium on Computational Geometry (SoCG 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 332, pp. 66:1-66:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{krauthgamer_et_al:LIPIcs.SoCG.2025.66,
  author =	{Krauthgamer, Robert and Petruschka, Nir},
  title =	{{Lipschitz Decompositions of Finite $\ell_p$ Metrics}},
  booktitle =	{41st International Symposium on Computational Geometry (SoCG 2025)},
  pages =	{66:1--66:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-370-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{332},
  editor =	{Aichholzer, Oswin and Wang, Haitao},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2025.66},
  URN =		{urn:nbn:de:0030-drops-232182},
  doi =		{10.4230/LIPIcs.SoCG.2025.66},
  annote =	{Keywords: Lipschitz decompositions, metric embeddings, geometric spanners}
}
Document
Differentially Private High-Dimensional Approximate Range Counting, Revisited

Authors: Martin Aumüller, Fabrizio Boninsegna, and Francesco Silvestri

Published in: LIPIcs, Volume 329, 6th Symposium on Foundations of Responsible Computing (FORC 2025)


Abstract
Locality Sensitive Filters are known for offering a quasi-linear space data structure with rigorous guarantees for the Approximate Near Neighbor search (ANN) problem. Building on Locality Sensitive Filters, we derive a simple data structure for the Approximate Near Neighbor Counting (ANNC) problem under differential privacy (DP). Moreover, we provide a simple analysis leveraging a connection with concomitant statistics and extreme value theory. Our approach produces a simple data structure with a tunable parameter that regulates a trade-off between space-time and utility. Through this trade-off, our data structure achieves the same performance as the recent findings of Andoni et al. (NeurIPS 2023) while offering better utility at the cost of higher space and query time. In addition, we provide a more efficient algorithm under pure ε-DP and elucidate the connection between ANN and differentially private ANNC. As a side result, the paper provides a more compact description and analysis of Locality Sensitive Filters for Fair Near Neighbor Search, improving a previous result in Aumüller et al. (TODS 2022).

Cite as

Martin Aumüller, Fabrizio Boninsegna, and Francesco Silvestri. Differentially Private High-Dimensional Approximate Range Counting, Revisited. In 6th Symposium on Foundations of Responsible Computing (FORC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 329, pp. 15:1-15:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{aumuller_et_al:LIPIcs.FORC.2025.15,
  author =	{Aum\"{u}ller, Martin and Boninsegna, Fabrizio and Silvestri, Francesco},
  title =	{{Differentially Private High-Dimensional Approximate Range Counting, Revisited}},
  booktitle =	{6th Symposium on Foundations of Responsible Computing (FORC 2025)},
  pages =	{15:1--15:24},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-367-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{329},
  editor =	{Bun, Mark},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FORC.2025.15},
  URN =		{urn:nbn:de:0030-drops-231426},
  doi =		{10.4230/LIPIcs.FORC.2025.15},
  annote =	{Keywords: Differential Privacy, Locality Sensitive Filters, Approximate Range Counting, Concomitant Statistics}
}
Document
Backdoor Defense, Learnability and Obfuscation

Authors: Paul Christiano, Jacob Hilton, Victor Lecomte, and Mark Xu

Published in: LIPIcs, Volume 325, 16th Innovations in Theoretical Computer Science Conference (ITCS 2025)


Abstract
We introduce a formal notion of defendability against backdoors using a game between an attacker and a defender. In this game, the attacker modifies a function to behave differently on a particular input known as the "trigger", while behaving the same almost everywhere else. The defender then attempts to detect the trigger at evaluation time. If the defender succeeds with high enough probability, then the function class is said to be defendable. The key constraint on the attacker that makes defense possible is that the attacker’s strategy must work for a randomly-chosen trigger. Our definition is simple and does not explicitly mention learning, yet we demonstrate that it is closely connected to learnability. In the computationally unbounded setting, we use a voting algorithm of [Hanneke et al., 2022] to show that defendability is essentially determined by the VC dimension of the function class, in much the same way as PAC learnability. In the computationally bounded setting, we use a similar argument to show that efficient PAC learnability implies efficient defendability, but not conversely. On the other hand, we use indistinguishability obfuscation to show that the class of polynomial size circuits is not efficiently defendable. Finally, we present polynomial size decision trees as a natural example for which defense is strictly easier than learning. Thus, we identify efficient defendability as a notable intermediate concept in between efficient learnability and obfuscation.

Cite as

Paul Christiano, Jacob Hilton, Victor Lecomte, and Mark Xu. Backdoor Defense, Learnability and Obfuscation. In 16th Innovations in Theoretical Computer Science Conference (ITCS 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 325, pp. 38:1-38:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


@InProceedings{christiano_et_al:LIPIcs.ITCS.2025.38,
  author =	{Christiano, Paul and Hilton, Jacob and Lecomte, Victor and Xu, Mark},
  title =	{{Backdoor Defense, Learnability and Obfuscation}},
  booktitle =	{16th Innovations in Theoretical Computer Science Conference (ITCS 2025)},
  pages =	{38:1--38:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-361-4},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{325},
  editor =	{Meka, Raghu},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2025.38},
  URN =		{urn:nbn:de:0030-drops-226662},
  doi =		{10.4230/LIPIcs.ITCS.2025.38},
  annote =	{Keywords: backdoors, machine learning, PAC learning, indistinguishability obfuscation}
}
Document
Tight Lower Bounds for Data-Dependent Locality-Sensitive Hashing

Authors: Alexandr Andoni and Ilya Razenshteyn

Published in: LIPIcs, Volume 51, 32nd International Symposium on Computational Geometry (SoCG 2016)


Abstract
We prove a tight lower bound for the exponent rho for data-dependent Locality-Sensitive Hashing schemes, recently used to design efficient solutions for the c-approximate nearest neighbor search. In particular, our lower bound matches the bound of rho <= 1/(2c-1) + o(1) for the l_1 space, obtained via the recent algorithm from [Andoni-Razenshteyn, STOC'15]. In recent years it emerged that data-dependent hashing is strictly superior to the classical Locality-Sensitive Hashing, when the hash function is data-independent. In the latter setting, the best exponent was already known: for the l_1 space, the tight bound is rho = 1/c, with the upper bound from [Indyk-Motwani, STOC'98] and the matching lower bound from [O'Donnell-Wu-Zhou, ITCS'11]. We prove that, even if the hashing is data-dependent, it must hold that rho >= 1/(2c-1) - o(1). To prove the result, we need to formalize the exact notion of data-dependent hashing that also captures the complexity of the hash functions (in addition to their collision properties). Without restricting such complexity, we would allow for obviously infeasible solutions such as the Voronoi diagram of a dataset. To preclude such solutions, we require our hash functions to be succinct. This condition is satisfied by all the known algorithmic results.
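
To make the gap between the two settings concrete, the l_1 exponents quoted in this abstract can be compared numerically. This is a small illustrative sketch that drops the o(1) terms; the function names are assumptions introduced here, not notation from the paper.

```python
def rho_data_independent(c):
    """Tight exponent rho = 1/c for classical (data-independent) LSH in l_1."""
    if c <= 1:
        raise ValueError("approximation factor c must exceed 1")
    return 1.0 / c

def rho_data_dependent(c):
    """Exponent rho = 1/(2c - 1) for data-dependent hashing in l_1,
    with the o(1) term ignored."""
    if c <= 1:
        raise ValueError("approximation factor c must exceed 1")
    return 1.0 / (2.0 * c - 1.0)

# Compare the two exponents for a few approximation factors.
for c in (1.5, 2.0, 4.0):
    print(f"c={c}: 1/c={rho_data_independent(c):.4f}, "
          f"1/(2c-1)={rho_data_dependent(c):.4f}")
```

For every c > 1 the data-dependent exponent is the smaller of the two, which is the sense in which the abstract calls data-dependent hashing strictly superior.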

Cite as

Alexandr Andoni and Ilya Razenshteyn. Tight Lower Bounds for Data-Dependent Locality-Sensitive Hashing. In 32nd International Symposium on Computational Geometry (SoCG 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 51, pp. 9:1-9:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


@InProceedings{andoni_et_al:LIPIcs.SoCG.2016.9,
  author =	{Andoni, Alexandr and Razenshteyn, Ilya},
  title =	{{Tight Lower Bounds for Data-Dependent Locality-Sensitive Hashing}},
  booktitle =	{32nd International Symposium on Computational Geometry (SoCG 2016)},
  pages =	{9:1--9:11},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-009-5},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{51},
  editor =	{Fekete, S\'{a}ndor and Lubiw, Anna},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2016.9},
  URN =		{urn:nbn:de:0030-drops-59014},
  doi =		{10.4230/LIPIcs.SoCG.2016.9},
  annote =	{Keywords: similarity search, high-dimensional geometry, LSH, data structures, lower bounds}
}
Document
Restricted Isometry Property for General p-Norms

Authors: Zeyuan Allen-Zhu, Rati Gelashvili, and Ilya Razenshteyn

Published in: LIPIcs, Volume 34, 31st International Symposium on Computational Geometry (SoCG 2015)


Abstract
The Restricted Isometry Property (RIP) is a fundamental property of a matrix which enables sparse recovery. Informally, an m x n matrix satisfies RIP of order k for the L_p norm, if |Ax|_p is approximately |x|_p for every x with at most k non-zero coordinates. For every 1 <= p < infty we obtain almost tight bounds on the minimum number of rows m necessary for the RIP property to hold. Prior to this work, only the cases p = 1, 1 + 1/log(k), and 2 were studied. Interestingly, our results show that the case p=2 is a "singularity" point: the optimal number of rows m is Theta(k^p) for all p in [1, infty)-{2}, as opposed to Theta(k) for p=2. We also obtain almost tight bounds for the column sparsity of RIP matrices and discuss implications of our results for the Stable Sparse Recovery problem.

Cite as

Zeyuan Allen-Zhu, Rati Gelashvili, and Ilya Razenshteyn. Restricted Isometry Property for General p-Norms. In 31st International Symposium on Computational Geometry (SoCG 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 34, pp. 451-460, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)


@InProceedings{allenzhu_et_al:LIPIcs.SOCG.2015.451,
  author =	{Allen-Zhu, Zeyuan and Gelashvili, Rati and Razenshteyn, Ilya},
  title =	{{Restricted Isometry Property for General p-Norms}},
  booktitle =	{31st International Symposium on Computational Geometry (SoCG 2015)},
  pages =	{451--460},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-83-5},
  ISSN =	{1868-8969},
  year =	{2015},
  volume =	{34},
  editor =	{Arge, Lars and Pach, J\'{a}nos},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SOCG.2015.451},
  URN =		{urn:nbn:de:0030-drops-51273},
  doi =		{10.4230/LIPIcs.SOCG.2015.451},
  annote =	{Keywords: compressive sensing, dimension reduction, linear algebra, high-dimensional geometry}
}