Document

**Published in:** LIPIcs, Volume 287, 15th Innovations in Theoretical Computer Science Conference (ITCS 2024)

We investigate unbiased high-dimensional mean estimators in differential privacy. We consider differentially private mechanisms whose expected output equals the mean of the input dataset, for every dataset drawn from a fixed bounded domain K in ℝ^d. A classical approach to private mean estimation is to compute the true mean and add unbiased, but possibly correlated, Gaussian noise to it. In the first part of this paper, we study the optimal error achievable by a Gaussian noise mechanism for a given domain K, when the error is measured in the 𝓁_p norm for some p ≥ 2. We give algorithms that compute the optimal covariance for the Gaussian noise for a given K under suitable assumptions, and prove a number of nice geometric properties of the optimal error. These results generalize the theory of factorization mechanisms from domains K that are symmetric and finite (or, equivalently, symmetric polytopes) to arbitrary bounded domains.
In the second part of the paper we show that Gaussian noise mechanisms achieve nearly optimal error among all private unbiased mean estimation mechanisms in a very strong sense. In particular, for every input dataset, an unbiased mean estimator satisfying concentrated differential privacy introduces approximately at least as much error as the best Gaussian noise mechanism. We extend this result to local differential privacy, and to approximate differential privacy, but for the latter the error lower bound holds either for a dataset or for a neighboring dataset, and this relaxation is necessary.

Aleksandar Nikolov and Haohua Tang. General Gaussian Noise Mechanisms and Their Optimality for Unbiased Mean Estimation. In 15th Innovations in Theoretical Computer Science Conference (ITCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 287, pp. 85:1-85:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{nikolov_et_al:LIPIcs.ITCS.2024.85, author = {Nikolov, Aleksandar and Tang, Haohua}, title = {{General Gaussian Noise Mechanisms and Their Optimality for Unbiased Mean Estimation}}, booktitle = {15th Innovations in Theoretical Computer Science Conference (ITCS 2024)}, pages = {85:1--85:23}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-309-6}, ISSN = {1868-8969}, year = {2024}, volume = {287}, editor = {Guruswami, Venkatesan}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2024.85}, URN = {urn:nbn:de:0030-drops-196133}, doi = {10.4230/LIPIcs.ITCS.2024.85}, annote = {Keywords: differential privacy, mean estimation, unbiased estimator, instance optimality} }

Document

**Published in:** LIPIcs, Volume 189, 37th International Symposium on Computational Geometry (SoCG 2021)

A recent series of papers by Andoni, Naor, Nikolov, Razenshteyn, and Waingarten (STOC 2018, FOCS 2018) has given approximate near neighbour search (NNS) data structures for a wide class of distance metrics, including all norms. In particular, these data structures achieve approximation on the order of p for 𝓁_p^d norms with space complexity nearly linear in the dataset size n and polynomial in the dimension d, and query time sub-linear in n and polynomial in d. The main shortcoming is the exponential in d pre-processing time required for their construction.
In this paper, we describe a more direct framework for constructing NNS data structures for general norms. More specifically, we show via an algorithmic reduction that an efficient NNS data structure for a metric ℳ is implied by an efficient average distortion embedding of ℳ into 𝓁₁ or the Euclidean space. In particular, the resulting data structures require only polynomial pre-processing time, as long as the embedding can be computed in polynomial time.
As a concrete instantiation of this framework, we give an NNS data structure for 𝓁_p with efficient pre-processing that matches the approximation factor, space and query complexity of the aforementioned data structure of Andoni et al. On the way, we resolve a question of Naor (Analysis and Geometry in Metric Spaces, 2014) and provide an explicit, efficiently computable embedding of 𝓁_p, for p ≥ 1, into 𝓁₁ with average distortion on the order of p. Furthermore, we also give data structures for Schatten-p spaces with improved space and query complexity, albeit still requiring exponential pre-processing when p ≥ 2. We expect our approach to pave the way for constructing efficient NNS data structures for all norms.

Deepanshu Kush, Aleksandar Nikolov, and Haohua Tang. Near Neighbor Search via Efficient Average Distortion Embeddings. In 37th International Symposium on Computational Geometry (SoCG 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 189, pp. 50:1-50:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{kush_et_al:LIPIcs.SoCG.2021.50, author = {Kush, Deepanshu and Nikolov, Aleksandar and Tang, Haohua}, title = {{Near Neighbor Search via Efficient Average Distortion Embeddings}}, booktitle = {37th International Symposium on Computational Geometry (SoCG 2021)}, pages = {50:1--50:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-184-9}, ISSN = {1868-8969}, year = {2021}, volume = {189}, editor = {Buchin, Kevin and Colin de Verdi\`{e}re, \'{E}ric}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2021.50}, URN = {urn:nbn:de:0030-drops-138490}, doi = {10.4230/LIPIcs.SoCG.2021.50}, annote = {Keywords: Nearest neighbor search, metric space embeddings, average distortion embeddings, locality-sensitive hashing} }

Document

**Published in:** LIPIcs, Volume 173, 28th Annual European Symposium on Algorithms (ESA 2020)

Many problems in computer science and applied mathematics require rounding a vector 𝐰 of fractional values lying in the interval [0,1] to a binary vector 𝐱 so that, for a given matrix 𝐀, 𝐀𝐱 is as close to 𝐀𝐰 as possible. For example, this problem arises in LP rounding algorithms used to approximate NP-hard optimization problems and in the design of uniformly distributed point sets for numerical integration. For a given matrix 𝐀, the worst-case error over all choices of 𝐰 incurred by the best possible rounding is measured by the linear discrepancy of 𝐀, a quantity studied in discrepancy theory, and introduced by Lovasz, Spencer, and Vesztergombi (EJC, 1986).
We initiate the study of the computational complexity of linear discrepancy. Our investigation proceeds in two directions: (1) proving hardness results and (2) finding both exact and approximate algorithms to evaluate the linear discrepancy of certain matrices. For (1), we show that linear discrepancy is NP-hard. Thus we do not expect to find an efficient exact algorithm for the general case. Restricting our attention to matrices with a constant number of rows, we present a poly-time exact algorithm for matrices consisting of a single row and matrices with a constant number of rows and entries of bounded magnitude. We also present an exponential-time approximation algorithm for general matrices, and an algorithm that approximates linear discrepancy to within an exponential factor.

Lily Li and Aleksandar Nikolov. On the Computational Complexity of Linear Discrepancy. In 28th Annual European Symposium on Algorithms (ESA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 173, pp. 69:1-69:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.ESA.2020.69, author = {Li, Lily and Nikolov, Aleksandar}, title = {{On the Computational Complexity of Linear Discrepancy}}, booktitle = {28th Annual European Symposium on Algorithms (ESA 2020)}, pages = {69:1--69:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-162-7}, ISSN = {1868-8969}, year = {2020}, volume = {173}, editor = {Grandoni, Fabrizio and Herman, Grzegorz and Sanders, Peter}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2020.69}, URN = {urn:nbn:de:0030-drops-129352}, doi = {10.4230/LIPIcs.ESA.2020.69}, annote = {Keywords: discrepancy theory, linear discrepancy, rounding, NP-hardness} }

Document

**Published in:** LIPIcs, Volume 129, 35th International Symposium on Computational Geometry (SoCG 2019)

In the geometric transportation problem, we are given a collection of points P in d-dimensional Euclidean space, and each point is given a supply of mu(p) units of mass, where mu(p) could be a positive or a negative integer, and the total sum of the supplies is 0. The goal is to find a flow (called a transportation map) that transports mu(p) units from any point p with mu(p) > 0, and transports -mu(p) units into any point p with mu(p) < 0. Moreover, the flow should minimize the total distance traveled by the transported mass. The optimal value is known as the transportation cost, or the Earth Mover’s Distance (from the points with positive supply to those with negative supply). This problem has been widely studied in many fields of computer science: from theoretical work in computational geometry, to applications in computer vision, graphics, and machine learning.
In this work we study approximation algorithms for the geometric transportation problem. We give an algorithm which, for any fixed dimension d, finds a (1+epsilon)-approximate transportation map in time nearly-linear in n, and polynomial in epsilon^{-1} and in the logarithm of the total supply. This is the first approximation scheme for the problem whose running time depends on n as n * polylog(n). Our techniques combine the generalized preconditioning framework of Sherman, which is grounded in continuous optimization, with simple geometric arguments to first reduce the problem to a minimum cost flow problem on a sparse graph, and then to design a good preconditioner for this latter problem.

Andrey Boris Khesin, Aleksandar Nikolov, and Dmitry Paramonov. Preconditioning for the Geometric Transportation Problem. In 35th International Symposium on Computational Geometry (SoCG 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 129, pp. 15:1-15:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{khesin_et_al:LIPIcs.SoCG.2019.15, author = {Khesin, Andrey Boris and Nikolov, Aleksandar and Paramonov, Dmitry}, title = {{Preconditioning for the Geometric Transportation Problem}}, booktitle = {35th International Symposium on Computational Geometry (SoCG 2019)}, pages = {15:1--15:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-104-7}, ISSN = {1868-8969}, year = {2019}, volume = {129}, editor = {Barequet, Gill and Wang, Yusu}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2019.15}, URN = {urn:nbn:de:0030-drops-104190}, doi = {10.4230/LIPIcs.SoCG.2019.15}, annote = {Keywords: Earth Mover Distance, Transportation Problem, Minimum Cost Flow} }

Document

**Published in:** LIPIcs, Volume 77, 33rd International Symposium on Computational Geometry (SoCG 2017)

We study the optimal sample complexity of a given workload of linear queries under the constraints of differential privacy. The sample complexity of a query answering mechanism under error parameter alpha is the smallest n such that the mechanism answers the workload with error at most alpha on any database of size n. Following a line of research started by Hardt and Talwar [STOC 2010], we analyze sample complexity using the tools of asymptotic convex geometry. We study the sensitivity polytope, a natural convex body associated with a query workload that quantifies how query answers can change between neighboring databases. This is the information that, roughly speaking, is protected by a differentially private algorithm, and, for this reason, we expect that a "bigger" sensitivity polytope implies larger sample complexity. Our results identify the mean Gaussian width as an appropriate measure of the size of the polytope, and show sample complexity lower bounds in terms of this quantity. Our lower bounds completely characterize the workloads for which the Gaussian noise mechanism is optimal up to constants as those having asymptotically maximal Gaussian width.
Our techniques also yield an alternative proof of Pisier's Volume Number Theorem which also suggests an approach to improving the parameters of the theorem.

Assimakis Kattis and Aleksandar Nikolov. Lower Bounds for Differential Privacy from Gaussian Width. In 33rd International Symposium on Computational Geometry (SoCG 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 77, pp. 45:1-45:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{kattis_et_al:LIPIcs.SoCG.2017.45, author = {Kattis, Assimakis and Nikolov, Aleksandar}, title = {{Lower Bounds for Differential Privacy from Gaussian Width}}, booktitle = {33rd International Symposium on Computational Geometry (SoCG 2017)}, pages = {45:1--45:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-038-5}, ISSN = {1868-8969}, year = {2017}, volume = {77}, editor = {Aronov, Boris and Katz, Matthew J.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2017.45}, URN = {urn:nbn:de:0030-drops-72368}, doi = {10.4230/LIPIcs.SoCG.2017.45}, annote = {Keywords: differential privacy, convex geometry, lower bounds, sample complexity} }

Document

**Published in:** LIPIcs, Volume 60, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016)

An important theorem of Banaszczyk (Random Structures & Algorithms 1998) states that for any sequence of vectors of l_2 norm at most 1/5 and any convex body K of Gaussian measure 1/2 in R^n, there exists a signed combination of these vectors which lands inside K. A major open problem is to devise a constructive version of Banaszczyk's vector balancing theorem, i.e. to find an efficient algorithm which constructs the signed combination.
We make progress towards this goal along several fronts. As our first contribution, we show an equivalence between Banaszczyk's theorem and the existence of O(1)-subgaussian distributions over signed combinations. For the case of symmetric convex bodies, our equivalence implies the existence of a universal signing algorithm (i.e. independent of the body), which simply samples from the subgaussian sign distribution and checks to see if the associated combination lands inside the body. For asymmetric convex bodies, we provide a novel recentering procedure, which allows us to reduce to the case where the body is symmetric.
As our second main contribution, we show that the above framework can be efficiently implemented when the vectors have length O(1/sqrt{log n}), recovering Banaszczyk's results under this stronger assumption. More precisely, we use random walk techniques to produce the required O(1)-subgaussian signing distributions when the vectors have length O(1/sqrt{log n}), and use a stochastic gradient ascent method to implement the recentering procedure for asymmetric bodies.

Daniel Dadush, Shashwat Garg, Shachar Lovett, and Aleksandar Nikolov. Towards a Constructive Version of Banaszczyk's Vector Balancing Theorem. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 60, pp. 28:1-28:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{dadush_et_al:LIPIcs.APPROX-RANDOM.2016.28, author = {Dadush, Daniel and Garg, Shashwat and Lovett, Shachar and Nikolov, Aleksandar}, title = {{Towards a Constructive Version of Banaszczyk's Vector Balancing Theorem}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016)}, pages = {28:1--28:12}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-018-7}, ISSN = {1868-8969}, year = {2016}, volume = {60}, editor = {Jansen, Klaus and Mathieu, Claire and Rolim, Jos\'{e} D. P. and Umans, Chris}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2016.28}, URN = {urn:nbn:de:0030-drops-66512}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2016.28}, annote = {Keywords: Discrepancy, Vector Balancing, Convex Geometry} }

Document

**Published in:** LIPIcs, Volume 34, 31st International Symposium on Computational Geometry (SoCG 2015)

The gamma_2 norm of a real m by n matrix A is the minimum number t such that the column vectors of A are contained in a 0-centered ellipsoid E that in turn is contained in the hypercube [-t, t]^m. This classical quantity is polynomial-time computable and was proved by the second author and Talwar to approximate the hereditary discrepancy: it bounds the hereditary discrepancy from above and from below, up to logarithmic factors. Here we provided a simplified proof of the upper bound and show that both the upper and the lower bound are asymptotically tight in the worst case.
We then demonstrate on several examples the power of the gamma_2 norm as a tool for proving lower and upper bounds in discrepancy theory. Most notably, we prove a new lower bound of log(n)^(d-1) (up to constant factors) for the d-dimensional Tusnady problem, asking for the combinatorial discrepancy of an n-point set in d-dimensional space with respect to axis-parallel boxes. For d>2, this improves the previous best lower bound, which was of order approximately log(n)^((d-1)/2), and it comes close to the best known upper bound of O(log(n)^(d+1/2)), for which we also obtain a new, very simple proof. Applications to lower bounds for dynamic range searching and lower bounds in differential privacy are given.

Jirí Matoušek and Aleksandar Nikolov. Combinatorial Discrepancy for Boxes via the gamma_2 Norm. In 31st International Symposium on Computational Geometry (SoCG 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 34, pp. 1-15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

Copy BibTex To Clipboard

@InProceedings{matousek_et_al:LIPIcs.SOCG.2015.1, author = {Matou\v{s}ek, Jir{\'\i} and Nikolov, Aleksandar}, title = {{Combinatorial Discrepancy for Boxes via the gamma\underline2 Norm}}, booktitle = {31st International Symposium on Computational Geometry (SoCG 2015)}, pages = {1--15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-83-5}, ISSN = {1868-8969}, year = {2015}, volume = {34}, editor = {Arge, Lars and Pach, J\'{a}nos}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SOCG.2015.1}, URN = {urn:nbn:de:0030-drops-51091}, doi = {10.4230/LIPIcs.SOCG.2015.1}, annote = {Keywords: discrepancy theory, range counting, factorization norms} }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail