Search Results

Documents authored by Balcan, Maria-Florina


Document
Improved Sample Complexity Bounds for Branch-And-Cut

Authors: Maria-Florina Balcan, Siddharth Prasad, Tuomas Sandholm, and Ellen Vitercik

Published in: LIPIcs, Volume 235, 28th International Conference on Principles and Practice of Constraint Programming (CP 2022)


Abstract
The branch-and-cut algorithm for integer programming has a wide variety of tunable parameters that have a huge impact on its performance, but which are challenging to tune by hand. An increasingly popular approach is to use machine learning to configure these parameters based on a training set of integer programs from the application domain. We bound how large the training set should be to ensure that for any configuration, its average performance over the training set is close to its expected future performance. Our guarantees apply to parameters that control the most important aspects of branch-and-cut: node selection, branching constraint selection, and cut selection, and are sharper and more general than those from prior research.

Cite as

Maria-Florina Balcan, Siddharth Prasad, Tuomas Sandholm, and Ellen Vitercik. Improved Sample Complexity Bounds for Branch-And-Cut. In 28th International Conference on Principles and Practice of Constraint Programming (CP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 235, pp. 3:1-3:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{balcan_et_al:LIPIcs.CP.2022.3,
  author =	{Balcan, Maria-Florina and Prasad, Siddharth and Sandholm, Tuomas and Vitercik, Ellen},
  title =	{{Improved Sample Complexity Bounds for Branch-And-Cut}},
  booktitle =	{28th International Conference on Principles and Practice of Constraint Programming (CP 2022)},
  pages =	{3:1--3:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-240-2},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{235},
  editor =	{Solnon, Christine},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CP.2022.3},
  URN =		{urn:nbn:de:0030-drops-166321},
  doi =		{10.4230/LIPIcs.CP.2022.3},
  annote =	{Keywords: Automated algorithm configuration, integer programming, machine learning theory, tree search, branch-and-bound, branch-and-cut, cutting planes, sample complexity, generalization guarantees, data-driven algorithm design}
}
Document
Invited Talk
Generalization Guarantees for Data-Driven Mechanism Design (Invited Talk)

Authors: Maria-Florina Balcan

Published in: LIPIcs, Volume 219, 39th International Symposium on Theoretical Aspects of Computer Science (STACS 2022)


Abstract
Many mechanisms including pricing mechanisms and auctions typically come with a variety of tunable parameters which impact significantly their desired performance guarantees. Data-driven mechanism design is a powerful approach for designing mechanisms, where these parameters are tuned via machine learning based on data. In this talk I will discuss how techniques from machine learning theory can be adapted and extended to analyze generalization guarantees of data-driven mechanism design.

Cite as

Maria-Florina Balcan. Generalization Guarantees for Data-Driven Mechanism Design (Invited Talk). In 39th International Symposium on Theoretical Aspects of Computer Science (STACS 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 219, p. 2:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{balcan:LIPIcs.STACS.2022.2,
  author =	{Balcan, Maria-Florina},
  title =	{{Generalization Guarantees for Data-Driven Mechanism Design}},
  booktitle =	{39th International Symposium on Theoretical Aspects of Computer Science (STACS 2022)},
  pages =	{2:1--2:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-222-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{219},
  editor =	{Berenbrink, Petra and Monmege, Benjamin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2022.2},
  URN =		{urn:nbn:de:0030-drops-158127},
  doi =		{10.4230/LIPIcs.STACS.2022.2},
  annote =	{Keywords: mechanism configuration, algorithm configuration, machine learning, generalization guarantees}
}
Document
Track A: Algorithms, Complexity and Games
Robust Communication-Optimal Distributed Clustering Algorithms

Authors: Pranjal Awasthi, Ainesh Bakshi, Maria-Florina Balcan, Colin White, and David P. Woodruff

Published in: LIPIcs, Volume 132, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)


Abstract
In this work, we study the k-median and k-means clustering problems when the data is distributed across many servers and can contain outliers. While there has been a lot of work on these problems for worst-case instances, we focus on gaining a finer understanding through the lens of beyond worst-case analysis. Our main motivation is the following: for many applications such as clustering proteins by function or clustering communities in a social network, there is some unknown target clustering, and the hope is that running a k-median or k-means algorithm will produce clusterings which are close to matching the target clustering. Worst-case results can guarantee constant factor approximations to the optimal k-median or k-means objective value, but not closeness to the target clustering. Our first result is a distributed algorithm which returns a near-optimal clustering assuming a natural notion of stability, namely, approximation stability [Awasthi and Balcan, 2014], even when a constant fraction of the data are outliers. The communication complexity is O~(sk+z) where s is the number of machines, k is the number of clusters, and z is the number of outliers. Next, we show this amount of communication cannot be improved even in the setting when the input satisfies various non-worst-case assumptions. We give a matching Omega(sk+z) lower bound on the communication required both for approximating the optimal k-means or k-median cost up to any constant, and for returning a clustering that is close to the target clustering in Hamming distance. These lower bounds hold even when the data satisfies approximation stability or other common notions of stability, and the cluster sizes are balanced. Therefore, Omega(sk+z) is a communication bottleneck, even for real-world instances.

Cite as

Pranjal Awasthi, Ainesh Bakshi, Maria-Florina Balcan, Colin White, and David P. Woodruff. Robust Communication-Optimal Distributed Clustering Algorithms. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 18:1-18:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{awasthi_et_al:LIPIcs.ICALP.2019.18,
  author =	{Awasthi, Pranjal and Bakshi, Ainesh and Balcan, Maria-Florina and White, Colin and Woodruff, David P.},
  title =	{{Robust Communication-Optimal Distributed Clustering Algorithms}},
  booktitle =	{46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)},
  pages =	{18:1--18:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-109-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{132},
  editor =	{Baier, Christel and Chatzigiannakis, Ioannis and Flocchini, Paola and Leonardi, Stefano},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.18},
  URN =		{urn:nbn:de:0030-drops-105942},
  doi =		{10.4230/LIPIcs.ICALP.2019.18},
  annote =	{Keywords: robust distributed clustering, communication complexity}
}
Document
Matrix Completion and Related Problems via Strong Duality

Authors: Maria-Florina Balcan, Yingyu Liang, David P. Woodruff, and Hongyang Zhang

Published in: LIPIcs, Volume 94, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018)


Abstract
This work studies the strong duality of non-convex matrix factorization problems: we show that under certain dual conditions, these problems and its dual have the same optimum. This has been well understood for convex optimization, but little was known for non-convex problems. We propose a novel analytical framework and show that under certain dual conditions, the optimal solution of the matrix factorization program is the same as its bi-dual and thus the global optimality of the non-convex program can be achieved by solving its bi-dual which is convex. These dual conditions are satisfied by a wide class of matrix factorization problems, although matrix factorization problems are hard to solve in full generality. This analytical framework may be of independent interest to non-convex optimization more broadly. We apply our framework to two prototypical matrix factorization problems: matrix completion and robust Principal Component Analysis (PCA). These are examples of efficiently recovering a hidden matrix given limited reliable observations of it. Our framework shows that exact recoverability and strong duality hold with nearly-optimal sample complexity guarantees for matrix completion and robust PCA.

Cite as

Maria-Florina Balcan, Yingyu Liang, David P. Woodruff, and Hongyang Zhang. Matrix Completion and Related Problems via Strong Duality. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 94, pp. 5:1-5:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{balcan_et_al:LIPIcs.ITCS.2018.5,
  author =	{Balcan, Maria-Florina and Liang, Yingyu and Woodruff, David P. and Zhang, Hongyang},
  title =	{{Matrix Completion and Related Problems via Strong Duality}},
  booktitle =	{9th Innovations in Theoretical Computer Science Conference (ITCS 2018)},
  pages =	{5:1--5:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-060-6},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{94},
  editor =	{Karlin, Anna R.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITCS.2018.5},
  URN =		{urn:nbn:de:0030-drops-83583},
  doi =		{10.4230/LIPIcs.ITCS.2018.5},
  annote =	{Keywords: Non-Convex Optimization, Strong Duality, Matrix Completion, Robust PCA, Sample Complexity}
}
Document
Foundations of Unsupervised Learning (Dagstuhl Seminar 16382)

Authors: Maria-Florina Balcan, Shai Ben-David, Ruth Urner, and Ulrike von Luxburg

Published in: Dagstuhl Reports, Volume 6, Issue 9 (2017)


Abstract
This report documents the program and the outcomes of Dagstuhl Seminar 16382 "Foundations of Unsupervised Learning". Unsupervised learning techniques are frequently used in practice of data analysis. However, there is currently little formal guidance as to how, when and to what effect to use which unsupervised learning method. The goal of the seminar was to initiate a broader and more systematic research on the foundations of unsupervised learning with the ultimate aim to provide more support to practitioners. The seminar brought together academic researchers from the fields of theoretical computer science and statistics as well as some researchers from industry.

Cite as

Maria-Florina Balcan, Shai Ben-David, Ruth Urner, and Ulrike von Luxburg. Foundations of Unsupervised Learning (Dagstuhl Seminar 16382). In Dagstuhl Reports, Volume 6, Issue 9, pp. 94-109, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@Article{balcan_et_al:DagRep.6.9.94,
  author =	{Balcan, Maria-Florina and Ben-David, Shai and Urner, Ruth and von Luxburg, Ulrike},
  title =	{{Foundations of Unsupervised Learning (Dagstuhl Seminar 16382)}},
  pages =	{94--109},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2017},
  volume =	{6},
  number =	{9},
  editor =	{Balcan, Maria-Florina and Ben-David, Shai and Urner, Ruth and von Luxburg, Ulrike},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.6.9.94},
  URN =		{urn:nbn:de:0030-drops-69542},
  doi =		{10.4230/DagRep.6.9.94},
  annote =	{Keywords: Machine learning, theory of computing, unsupervised learning, representation learning}
}
Document
k-Center Clustering Under Perturbation Resilience

Authors: Maria-Florina Balcan, Nika Haghtalab, and Colin White

Published in: LIPIcs, Volume 55, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)


Abstract
The k-center problem is a canonical and long-studied facility location and clustering problem with many applications in both its symmetric and asymmetric forms. Both versions of the problem have tight approximation factors on worst case instances: a 2-approximation for symmetric kcenter and an O(log*(k))-approximation for the asymmetric version. Therefore to improve on these ratios, one must go beyond the worst case. In this work, we take this approach and provide strong positive results both for the asymmetric and symmetric k-center problems under a very natural input stability (promise) condition called alpha-perturbation resilience [Bilu Linial, 2012], which states that the optimal solution does not change under any alpha-factor perturbation to the input distances. We show that by assuming 2-perturbation resilience, the exact solution for the asymmetric k-center problem can be found in polynomial time. To our knowledge, this is the first problem that is hard to approximate to any constant factor in the worst case, yet can be optimally solved in polynomial time under perturbation resilience for a constant value of alpha. Furthermore, we prove our result is tight by showing symmetric k-center under (2-epsilon)-perturbation resilience is hard unless NP=RP. This is the first tight result for any problem under perturbation resilience, i.e., this is the first time the exact value of alpha for which the problem switches from being NP-hard to efficiently computable has been found. Our results illustrate a surprising relationship between symmetric and asymmetric k-center instances under perturbation resilience. Unlike approximation ratio, for which symmetric k-center is easily solved to a factor of 2 but asymmetric k-center cannot be approximated to any constant factor, both symmetric and asymmetric k-center can be solved optimally under resilience to 2-perturbations.

Cite as

Maria-Florina Balcan, Nika Haghtalab, and Colin White. k-Center Clustering Under Perturbation Resilience. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 68:1-68:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{balcan_et_al:LIPIcs.ICALP.2016.68,
  author =	{Balcan, Maria-Florina and Haghtalab, Nika and White, Colin},
  title =	{{k-Center Clustering Under Perturbation Resilience}},
  booktitle =	{43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)},
  pages =	{68:1--68:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-013-2},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{55},
  editor =	{Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2016.68},
  URN =		{urn:nbn:de:0030-drops-62160},
  doi =		{10.4230/LIPIcs.ICALP.2016.68},
  annote =	{Keywords: k-center, clustering, perturbation resilience}
}
Document
Item Pricing for Revenue Maximization in Combinatorial Auctions

Authors: Maria-Florina Balcan

Published in: Dagstuhl Seminar Proceedings, Volume 7271, Computational Social Systems and the Internet (2007)


Abstract
Consider the problem of a retailer with various goods for sale, attempting to set prices to maximize revenue. If customers have separate valuations over the different goods, and these are known to the retailer, then the goods can be priced separately and the problem is not so difficult. However, when customers have valuations over sets of items, this becomes a combinatorial auction problem, and the problem becomes computationally hard even when valuations are fully known in advance. In this talk we present some simple randomized algorithms and mechanisms for a number of interesting cases of this problem, both in the limited and unlimited supply setting. This talk is based on joint work with Avrim Blum and Yishay Mansour.

Cite as

Maria-Florina Balcan. Item Pricing for Revenue Maximization in Combinatorial Auctions. In Computational Social Systems and the Internet. Dagstuhl Seminar Proceedings, Volume 7271, pp. 1-2, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2007)


Copy BibTex To Clipboard

@InProceedings{balcan:DagSemProc.07271.8,
  author =	{Balcan, Maria-Florina},
  title =	{{Item Pricing for Revenue Maximization in Combinatorial Auctions}},
  booktitle =	{Computational Social Systems and the Internet},
  pages =	{1--2},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2007},
  volume =	{7271},
  editor =	{Peter Cramton and Rudolf M\"{u}ller and Eva Tardos and Moshe Tennenholtz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.07271.8},
  URN =		{urn:nbn:de:0030-drops-11534},
  doi =		{10.4230/DagSemProc.07271.8},
  annote =	{Keywords: Item Pricing, Revenue Maximizing, Combinatorial Auctions}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail