Greedy Minimization of Weakly Supermodular Set Functions

Authors: Edo Liberty and Maxim Sviridenko





Cite As

Edo Liberty and Maxim Sviridenko. Greedy Minimization of Weakly Supermodular Set Functions. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 81, pp. 19:1-19:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017) https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2017.19

Abstract

Many problems in data mining and unsupervised machine learning take the form of minimizing a set function under a cardinality constraint. More explicitly, denote by [n] the set {1,...,n} and let f : 2^[n] -> R+ be a set function. Our goal is to minimize f(S) subject to |S| <= k. These problems include clustering and covering problems as well as sparse regression, matrix approximation, and many others. Such combinatorial problems are hard to minimize in general, and finding good (e.g., constant-factor) approximate solutions for them requires significant sophistication and highly specialized algorithms.
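
To make the setup concrete, the sketch below (Python, assuming numpy; the data A, b and the name f_sparse_regression are illustrative, not from the paper) evaluates one member of this problem class, the sparse-regression objective f(S) = min_x ||A_S x - b||^2. It is non-negative, and adding columns to S can only lower the residual, so it is also non-increasing.

    # A minimal sketch of one instance of the problem class: sparse regression.
    # f(S) is the least-squares residual using only the columns of A indexed
    # by S. The matrix A and vector b here are illustrative placeholders.
    import numpy as np

    def f_sparse_regression(A, b, S):
        """Residual of the least-squares fit restricted to columns in S."""
        if not S:
            return float(b @ b)  # empty support: residual is ||b||^2
        cols = sorted(S)
        x, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
        r = A[:, cols] @ x - b
        return float(r @ r)

    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 10))
    b = rng.standard_normal(50)
    print(f_sparse_regression(A, b, {0, 3, 7}))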

In this paper we analyze the behavior of the greedy algorithm on all of these problems. We start by observing that the functions above are special. A trivial observation is that they are non-negative and non-increasing, that is, f(S) >= f(S ∪ T) >= 0 for any sets S and T. This immediately shows that expanding a solution set is (at least potentially) beneficial in terms of reducing the function value. But monotonicity alone is not sufficient to ensure that any number of greedy extensions of a given solution significantly reduces the objective function.
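
The greedy procedure in question can be sketched in a few lines (Python; greedy_extend is a hypothetical name for the generic greedy scheme, not the paper's exact pseudocode): starting from a current solution, it repeatedly adds the single element whose inclusion decreases f the most.

    # A minimal sketch of the greedy algorithm: grow S one element at a
    # time, each time adding the element that yields the smallest f value.
    # Monotonicity guarantees no step increases f; the question analyzed in
    # the paper is when these steps make substantial progress.
    def greedy_extend(f, n, k, S=None):
        """Greedily grow S up to size k, minimizing the set function f."""
        S = set() if S is None else set(S)
        while len(S) < k:
            best = min((i for i in range(n) if i not in S),
                       key=lambda i: f(S | {i}))
            S.add(best)
        return S

For instance, one could run greedy_extend(lambda S: f_sparse_regression(A, b, S), n=10, k=3) on the sparse-regression objective sketched above.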


Keywords
  • Weak Supermodularity
  • Greedy Algorithms
  • Machine Learning
  • Data Mining

