LIPIcs.SoCG.2016.26.pdf
- Filesize: 0.5 MB
- 14 pages
Diversity maximization is an important concept in information retrieval, computational geometry and operations research. Usually, it is a variant of the following problem: Given a ground set, constraints, and a function f that measures diversity of a subset, the task is to select a feasible subset S such that f(S) is maximized. The sum-dispersion function f(S) which is the sum of the pairwise distances in S, is in this context a prominent diversification measure. The corresponding diversity maximization is the "max-sum" or "sum-sum" diversification. Many recent results deal with the design of constant-factor approximation algorithms of diversification problems involving sum-dispersion function under a matroid constraint. In this paper, we present a PTAS for the max-sum diversity problem under a matroid constraint for distances d(.,.) of negative type. Distances of negative type are, for example, metric distances stemming from the l_2 and l_1 norms, as well as the cosine or spherical, or Jaccard distance which are popular similarity metrics in web and image search. Our algorithm is based on techniques developed in geometric algorithms like metric embeddings and convex optimization. We show that one can compute a fractional solution of the usually non-convex relaxation of the problem which yields an upper bound on the optimum integer solution. Starting from this fractional solution, we employ a deterministic rounding approach which only incurs a small loss in terms of objective, thus leading to a PTAS. This technique can be applied to other previously studied variants of the max-sum dispersion function, including combinations of diversity with linear-score maximization, improving over the previous constant-factor approximation algorithms.
Feedback for Dagstuhl Publishing