Document

APPROX

**Published in:** LIPIcs, Volume 207, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)

We give a constant factor polynomial time pseudo-approximation algorithm for min-sum clustering with or without outliers. The algorithm is allowed to exclude an arbitrarily small constant fraction of the points. For instance, we show how to compute a solution that clusters 98% of the input data points and pays no more than a constant factor times the optimal solution that clusters 99% of the input data points. More generally, we give the following bicriteria approximation: For any ε > 0, for any instance with n input points and for any positive integer n' ≤ n, we compute in polynomial time a clustering of at least (1-ε) n' points of cost at most a constant factor greater than the optimal cost of clustering n' points. The approximation guarantee grows with 1/(ε). Our results apply to instances of points in real space endowed with squared Euclidean distance, as well as to points in a metric space, where the number of clusters, and also the dimension if relevant, is arbitrary (part of the input, not an absolute constant).

Sandip Banerjee, Rafail Ostrovsky, and Yuval Rabani. Min-Sum Clustering (With Outliers). In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 16:1-16:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{banerjee_et_al:LIPIcs.APPROX/RANDOM.2021.16, author = {Banerjee, Sandip and Ostrovsky, Rafail and Rabani, Yuval}, title = {{Min-Sum Clustering (With Outliers)}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021)}, pages = {16:1--16:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-207-5}, ISSN = {1868-8969}, year = {2021}, volume = {207}, editor = {Wootters, Mary and Sanit\`{a}, Laura}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2021.16}, URN = {urn:nbn:de:0030-drops-147093}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2021.16}, annote = {Keywords: Clustering, approximation algorithms, primal-dual} }

Document

**Published in:** LIPIcs, Volume 173, 28th Annual European Symposium on Algorithms (ESA 2020)

In many classic clustering problems, we seek to sketch a massive data set of n points (a.k.a clients) in a metric space, by segmenting them into k categories or clusters, each cluster represented concisely by a single point in the metric space (a.k.a. the cluster’s center or its facility). The goal is to find such a sketch that minimizes some objective that depends on the distances between the clients and their respective facilities (the objective is a.k.a. the service cost). Two notable examples are the k-center/k-supplier problem where the objective is to minimize the maximum distance from any client to its facility, and the k-median problem where the objective is to minimize the sum over all clients of the distance from the client to its facility.
In practical applications of clustering, the data set may evolve over time, reflecting an evolution of the underlying clustering model. Thus, in such applications, a good clustering must simultaneously represent the temporal data set well, but also not change too drastically between time steps. In this paper, we initiate the study of a dynamic version of clustering problems that aims to capture these considerations. In this version there are T time steps, and in each time step t ∈ {1,2,… ,T}, the set of clients needed to be clustered may change, and we can move the k facilities between time steps. The general goal is to minimize certain combinations of the service cost and the facility movement cost, or minimize one subject to some constraints on the other. More specifically, we study two concrete problems in this framework: the Dynamic Ordered k-Median and the Dynamic k-Supplier problem. Our technical contributions are as follows:
- We consider the Dynamic Ordered k-Median problem, where the objective is to minimize the weighted sum of ordered distances over all time steps, plus the total cost of moving the facilities between time steps. We present one constant-factor approximation algorithm for T = 2 and another approximation algorithm for fixed T ≥ 3.
- We consider the Dynamic k-Supplier problem, where the objective is to minimize the maximum distance from any client to its facility, subject to the constraint that between time steps the maximum distance moved by any facility is no more than a given threshold. When the number of time steps T is 2, we present a simple constant factor approximation algorithm and a bi-criteria constant factor approximation algorithm for the outlier version, where some of the clients can be discarded. We also show that it is NP-hard to approximate the problem with any factor for T ≥ 3.

Shichuan Deng, Jian Li, and Yuval Rabani. Approximation Algorithms for Clustering with Dynamic Points. In 28th Annual European Symposium on Algorithms (ESA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 173, pp. 37:1-37:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{deng_et_al:LIPIcs.ESA.2020.37, author = {Deng, Shichuan and Li, Jian and Rabani, Yuval}, title = {{Approximation Algorithms for Clustering with Dynamic Points}}, booktitle = {28th Annual European Symposium on Algorithms (ESA 2020)}, pages = {37:1--37:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-162-7}, ISSN = {1868-8969}, year = {2020}, volume = {173}, editor = {Grandoni, Fabrizio and Herman, Grzegorz and Sanders, Peter}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2020.37}, URN = {urn:nbn:de:0030-drops-129037}, doi = {10.4230/LIPIcs.ESA.2020.37}, annote = {Keywords: clustering, dynamic points, multi-objective optimization} }

Document

APPROX

**Published in:** LIPIcs, Volume 176, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)

We consider parametrized versions of metrical task systems and metrical service systems, two fundamental models of online computing, where the constrained parameter is the number of possible distinct requests m. Such parametrization occurs naturally in a wide range of applications. Striking examples are certain power management problems, which are modeled as metrical task systems with m = 2. We characterize the competitive ratio in terms of the parameter m for both deterministic and randomized algorithms on hierarchically separated trees. Our findings uncover a rich and unexpected picture that differs substantially from what is known or conjectured about the unparametrized versions of these problems. For metrical task systems, we show that deterministic algorithms do not exhibit any asymptotic gain beyond one-level trees (namely, uniform metric spaces), whereas randomized algorithms do not exhibit any asymptotic gain even for one-level trees. In contrast, the special case of metrical service systems (subset chasing) behaves very differently. Both deterministic and randomized algorithms exhibit gain, for m sufficiently small compared to n, for any number of levels. Most significantly, they exhibit a large gain for uniform metric spaces and a smaller gain for two-level trees. Moreover, it turns out that in these cases (as well as in the case of metrical task systems for uniform metric spaces with m being an absolute constant), deterministic algorithms are essentially as powerful as randomized algorithms. This is surprising and runs counter to the ubiquitous intuition/conjecture that, for most problems that can be modeled as metrical task systems, the randomized competitive ratio is polylogarithmic in the deterministic competitive ratio.

Sébastien Bubeck and Yuval Rabani. Parametrized Metrical Task Systems. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 176, pp. 54:1-54:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{bubeck_et_al:LIPIcs.APPROX/RANDOM.2020.54, author = {Bubeck, S\'{e}bastien and Rabani, Yuval}, title = {{Parametrized Metrical Task Systems}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)}, pages = {54:1--54:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-164-1}, ISSN = {1868-8969}, year = {2020}, volume = {176}, editor = {Byrka, Jaros{\l}aw and Meka, Raghu}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2020.54}, URN = {urn:nbn:de:0030-drops-126573}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2020.54}, annote = {Keywords: online computing, competitive analysis, metrical task systems} }

Document

**Published in:** LIPIcs, Volume 107, 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)

Osborne's iteration is a method for balancing n x n matrices which is widely used in linear algebra packages, as balancing preserves eigenvalues and stabilizes their numeral computation. The iteration can be implemented in any norm over R^n, but it is normally used in the L_2 norm. The choice of norm not only affects the desired balance condition, but also defines the iterated balancing step itself.
In this paper we focus on Osborne's iteration in any L_p norm, where p < infty. We design a specific implementation of Osborne's iteration in any L_p norm that converges to a strictly epsilon-balanced matrix in O~(epsilon^{-2}n^{9} K) iterations, where K measures, roughly, the number of bits required to represent the entries of the input matrix.
This is the first result that proves a variant of Osborne's iteration in the L_2 norm (or any L_p norm, p < infty) strictly balances matrices in polynomial time. This is a substantial improvement over our recent result (in SODA 2017) that showed weak balancing in L_p norms. Previously, Schulman and Sinclair (STOC 2015) showed strict balancing of another variant of Osborne's iteration in the L_infty norm. Their result does not imply any bounds on strict balancing in other norms.

Rafail Ostrovsky, Yuval Rabani, and Arman Yousefi. Strictly Balancing Matrices in Polynomial Time Using Osborne's Iteration. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 93:1-93:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{ostrovsky_et_al:LIPIcs.ICALP.2018.93, author = {Ostrovsky, Rafail and Rabani, Yuval and Yousefi, Arman}, title = {{Strictly Balancing Matrices in Polynomial Time Using Osborne's Iteration}}, booktitle = {45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)}, pages = {93:1--93:11}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-076-7}, ISSN = {1868-8969}, year = {2018}, volume = {107}, editor = {Chatzigiannakis, Ioannis and Kaklamanis, Christos and Marx, D\'{a}niel and Sannella, Donald}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2018.93}, URN = {urn:nbn:de:0030-drops-90976}, doi = {10.4230/LIPIcs.ICALP.2018.93}, annote = {Keywords: Numerical Linear Algebra, Optimization} }

Document

**Published in:** LIPIcs, Volume 81, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017)

We consider the problem of embedding a finite set of points x_1, ... , x_n in R^d that satisfy l_2^2 triangle inequalities into l_1, when the points are approximately low-dimensional. Goemans (unpublished, appears in a work of Magen and Moharammi (2008) ) showed that such points residing in exactly d dimensions can be embedded into l_1 with distortion at most sqrt{d}. We prove the following robust analogue of this statement: if there exists a r-dimensional subspace Pi such that the projections onto this subspace satisfy sum_{i,j in [n]} norm{Pi x_i - Pi x_j}_2^2 >= Omega(1) * sum_{i,j \in [n]} norm{x_i - x_j}_2^2, then there is an embedding of the points into l_1 with O(sqrt{r}) average distortion. A consequence of this result is that the integrality gap of the well-known Goemans-Linial SDP relaxation for the Uniform Sparsest Cut problem is O(sqrt{r}) on graphs G whose r-th smallest normalized eigenvalue of the Laplacian satisfies lambda_r(G)/n >= Omega(1)*Phi_{SDP}(G). Our result improves upon the previously known bound of O(r) on the average distortion, and the integrality gap of the Goemans-Linial SDP under the same preconditions, proven in [Deshpande and Venkat, 2014], and [Deshpande, Harsha and Venkat 2016].

Yuval Rabani and Rakesh Venkat. Approximating Sparsest Cut in Low Rank Graphs via Embeddings from Approximately Low Dimensional Spaces. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 81, pp. 21:1-21:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{rabani_et_al:LIPIcs.APPROX-RANDOM.2017.21, author = {Rabani, Yuval and Venkat, Rakesh}, title = {{Approximating Sparsest Cut in Low Rank Graphs via Embeddings from Approximately Low Dimensional Spaces}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017)}, pages = {21:1--21:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-044-6}, ISSN = {1868-8969}, year = {2017}, volume = {81}, editor = {Jansen, Klaus and Rolim, Jos\'{e} D. P. and Williamson, David P. and Vempala, Santosh S.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2017.21}, URN = {urn:nbn:de:0030-drops-75705}, doi = {10.4230/LIPIcs.APPROX-RANDOM.2017.21}, annote = {Keywords: Metric Embeddings, Sparsest Cut, Negative type metrics, Approximation Algorithms} }

Document

Complete Volume

**Published in:** LIPIcs, Volume 55, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)

LIPIcs, Volume 55, ICALP'16, Complete Volume

43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@Proceedings{chatzigiannakis_et_al:LIPIcs.ICALP.2016, title = {{LIPIcs, Volume 55, ICALP'16, Complete Volume}}, booktitle = {43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-013-2}, ISSN = {1868-8969}, year = {2016}, volume = {55}, editor = {Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2016}, URN = {urn:nbn:de:0030-drops-65844}, doi = {10.4230/LIPIcs.ICALP.2016}, annote = {Keywords: Theory of Computation} }

Document

Front Matter

**Published in:** LIPIcs, Volume 55, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)

Front Matter, Table of Contents, Preface, Organization, List of Authors

43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 0:i-0:xliv, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{chatzigiannakis_et_al:LIPIcs.ICALP.2016.0, author = {Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide}, title = {{Front Matter, Table of Contents, Preface, Organization, List of Authors}}, booktitle = {43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)}, pages = {0:i--0:xliv}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-013-2}, ISSN = {1868-8969}, year = {2016}, volume = {55}, editor = {Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2016.0}, URN = {urn:nbn:de:0030-drops-61917}, doi = {10.4230/LIPIcs.ICALP.2016.0}, annote = {Keywords: Front Matter, Table of Contents, Preface, Organization, List of Authors} }

Document

**Published in:** LIPIcs, Volume 18, IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2012)

We discuss recent results giving algorithms for learning mixtures of unstructured distributions.

Yuval Rabani. Learning Mixtures of Distributions over Large Discrete Domains. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2012). Leibniz International Proceedings in Informatics (LIPIcs), Volume 18, pp. 1-3, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)

Copy BibTex To Clipboard

@InProceedings{rabani:LIPIcs.FSTTCS.2012.1, author = {Rabani, Yuval}, title = {{Learning Mixtures of Distributions over Large Discrete Domains}}, booktitle = {IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2012)}, pages = {1--3}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-47-7}, ISSN = {1868-8969}, year = {2012}, volume = {18}, editor = {D'Souza, Deepak and Radhakrishnan, Jaikumar and Telikepalli, Kavitha}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2012.1}, URN = {urn:nbn:de:0030-drops-38428}, doi = {10.4230/LIPIcs.FSTTCS.2012.1}, annote = {Keywords: machine learning, mixture models, topic models} }

Document

**Published in:** LIPIcs, Volume 9, 28th International Symposium on Theoretical Aspects of Computer Science (STACS 2011)

This paper studies the ``explanation problem'' for tree- and linearly-ordered array data, a problem motivated by database applications and recently solved for the one-dimensional tree-ordered case. In this paper, one is given a matrix A=(a_{ij}) whose rows and columns have semantics: special subsets of the rows and special subsets of the columns are meaningful, others are not. A submatrix in A is said to be meaningful if and only if it is the cross product of a meaningful row subset and a meaningful column subset, in which case we call it an ``allowed rectangle.'' The goal is to ``explain'' A as a sparse sum of weighted allowed rectangles. Specifically, we wish to find as few weighted allowed rectangles as possible such that, for all i,j, a_ij equals the sum of the weights of all rectangles which include cell (i,j).
In this paper we consider the natural cases in which the matrix dimensions are tree-ordered or linearly-ordered. In the tree-ordered case, we are given a rooted tree $T_1$ whose leaves are the rows of $A$ and another, $T_2$, whose leaves are the columns. Nodes of the trees correspond in an obvious way to the sets of their leaf descendants. In the linearly-ordered case, a set of rows or columns is meaningful if and only if it is contiguous.
For tree-ordered data, we prove the explanation problem NP-Hard and give a randomized $2$-approximation algorithm for it. For linearly-ordered data, we prove the explanation problem NP-Har and give a $2.56$-approximation algorithm. To our knowledge, these are the first results for the problem of sparsely and exactly representing matrices by weighted rectangles.

Howard Karloff, Flip Korn, Konstantin Makarychev, and Yuval Rabani. On Parsimonious Explanations For 2-D Tree- and Linearly-Ordered Data. In 28th International Symposium on Theoretical Aspects of Computer Science (STACS 2011). Leibniz International Proceedings in Informatics (LIPIcs), Volume 9, pp. 332-343, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2011)

Copy BibTex To Clipboard

@InProceedings{karloff_et_al:LIPIcs.STACS.2011.332, author = {Karloff, Howard and Korn, Flip and Makarychev, Konstantin and Rabani, Yuval}, title = {{On Parsimonious Explanations For 2-D Tree- and Linearly-Ordered Data}}, booktitle = {28th International Symposium on Theoretical Aspects of Computer Science (STACS 2011)}, pages = {332--343}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-25-5}, ISSN = {1868-8969}, year = {2011}, volume = {9}, editor = {Schwentick, Thomas and D\"{u}rr, Christoph}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2011.332}, URN = {urn:nbn:de:0030-drops-30246}, doi = {10.4230/LIPIcs.STACS.2011.332}, annote = {Keywords: ordered data, explanation problem} }

Document

**Published in:** Dagstuhl Seminar Reports. Dagstuhl Seminar Reports, Volume 1 (2021)

Yuval Rabani, David Shmoys, and Gerhard Woeginger. Combinatorial Approximation Algorithms (Dagstuhl Seminar 9734). Dagstuhl Seminar Report 187, pp. 1-33, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (1998)

Copy BibTex To Clipboard

@TechReport{rabani_et_al:DagSemRep.187, author = {Rabani, Yuval and Shmoys, David and Woeginger, Gerhard}, title = {{Combinatorial Approximation Algorithms (Dagstuhl Seminar 9734)}}, pages = {1--33}, ISSN = {1619-0203}, year = {1998}, type = {Dagstuhl Seminar Report}, number = {187}, institution = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/DagSemRep.187}, URN = {urn:nbn:de:0030-drops-150746}, doi = {10.4230/DagSemRep.187}, }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail