LIPIcs.APPROX-RANDOM.2017.19.pdf
- Filesize: 426 kB
- 11 pages
Many problems in data mining and unsupervised machine learning take the form of minimizing a set function with cardinality constraints. More explicitly, denote by [n] the set {1,...,n} and let f(S) be a function from 2^[n] to R+. Our goal is to minimize f(S) subject to |S| <= k. These problems include clustering and covering problems as well as sparse regression, matrix approximation problems and many others. These combinatorial problems are hard to minimize in general. Finding good (e.g. constant factor) approximate solutions for them requires significant sophistication and highly specialized algorithms. In this paper we analyze the behavior of the greedy algorithm to all of these problems. We start by claiming that the functions above are special. A trivial observation is that they are non-negative and non-increasing, that is, f(S) >= f(union(S,T)) >= 0 for any S and T. This immediately shows that expanding solution sets is (at least potentially) beneficial in terms of reducing the function value. But, monotonicity is not sufficient to ensure that any number of greedy extensions of a given solution would significantly reduce the objective function.
Feedback for Dagstuhl Publishing