Document

**Published in:** LIPIcs, Volume 123, 29th International Symposium on Algorithms and Computation (ISAAC 2018)

Diversity maximization is an important geometric optimization problem with many applications in recommender systems, machine learning or search engines among others. A typical diversification problem is as follows: Given a finite metric space (X,d) and a parameter k in N, find a subset of k elements of X that has maximum diversity. There are many functions that measure diversity. One of the most popular measures, called remote-clique, is the sum of the pairwise distances of the chosen elements. In this paper, we present novel results on three widely used diversity measures: Remote-clique, remote-star and remote-bipartition.
Our main result are polynomial time approximation schemes for these three diversification problems under the assumption that the metric space is doubling. This setting has been discussed in the recent literature. The existence of such a PTAS however was left open.
Our results also hold in the setting where the distances are raised to a fixed power q >= 1, giving rise to more variants of diversity functions, similar in spirit to the variations of clustering problems depending on the power applied to the pairwise distances. Finally, we provide a proof of NP-hardness for remote-clique with squared distances in doubling metric spaces.

Alfonso Cevallos, Friedrich Eisenbrand, and Sarah Morell. Diversity Maximization in Doubling Metrics. In 29th International Symposium on Algorithms and Computation (ISAAC 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 123, pp. 33:1-33:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{cevallos_et_al:LIPIcs.ISAAC.2018.33, author = {Cevallos, Alfonso and Eisenbrand, Friedrich and Morell, Sarah}, title = {{Diversity Maximization in Doubling Metrics}}, booktitle = {29th International Symposium on Algorithms and Computation (ISAAC 2018)}, pages = {33:1--33:12}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-094-1}, ISSN = {1868-8969}, year = {2018}, volume = {123}, editor = {Hsu, Wen-Lian and Lee, Der-Tsai and Liao, Chung-Shou}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2018.33}, URN = {urn:nbn:de:0030-drops-99818}, doi = {10.4230/LIPIcs.ISAAC.2018.33}, annote = {Keywords: Remote-clique, remote-star, remote-bipartition, doubling dimension, grid rounding, epsilon-nets, polynomial time approximation scheme, facility location, information retrieval} }

Document

**Published in:** LIPIcs, Volume 51, 32nd International Symposium on Computational Geometry (SoCG 2016)

Diversity maximization is an important concept in information retrieval, computational geometry and operations research. Usually, it is a variant of the following problem: Given a ground set, constraints, and a function f that measures diversity of a subset, the task is to select a feasible subset S such that f(S) is maximized. The sum-dispersion function f(S) which is the sum of the pairwise distances in S, is in this context a prominent diversification measure. The corresponding diversity maximization is the "max-sum" or "sum-sum" diversification. Many recent results deal with the design of constant-factor approximation algorithms of diversification problems involving sum-dispersion function under a matroid constraint.
In this paper, we present a PTAS for the max-sum diversity problem under a matroid constraint for distances d(.,.) of negative type. Distances of negative type are, for example, metric distances stemming from the l_2 and l_1 norms, as well as the cosine or spherical, or Jaccard distance which are popular similarity metrics in web and image search.
Our algorithm is based on techniques developed in geometric algorithms like metric embeddings and convex optimization. We show that one can compute a fractional solution of the usually non-convex relaxation of the problem which yields an upper bound on the optimum integer solution. Starting from this fractional solution, we employ a deterministic rounding approach which only incurs a small loss in terms of objective, thus leading to a PTAS. This technique can be applied to other previously studied variants of the max-sum dispersion function, including combinations of diversity with linear-score maximization, improving over the previous constant-factor approximation algorithms.

Alfonso Cevallos, Friedrich Eisenbrand, and Rico Zenklusen. Max-Sum Diversity Via Convex Programming. In 32nd International Symposium on Computational Geometry (SoCG 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 51, pp. 26:1-26:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{cevallos_et_al:LIPIcs.SoCG.2016.26, author = {Cevallos, Alfonso and Eisenbrand, Friedrich and Zenklusen, Rico}, title = {{Max-Sum Diversity Via Convex Programming}}, booktitle = {32nd International Symposium on Computational Geometry (SoCG 2016)}, pages = {26:1--26:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-009-5}, ISSN = {1868-8969}, year = {2016}, volume = {51}, editor = {Fekete, S\'{a}ndor and Lubiw, Anna}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2016.26}, URN = {urn:nbn:de:0030-drops-59186}, doi = {10.4230/LIPIcs.SoCG.2016.26}, annote = {Keywords: Geometric Dispersion, Embeddings, Approximation Algorithms, Convex Programming, Matroids} }