Max-Sum Diversity Via Convex Programming

Cevallos, Alfonso; Eisenbrand, Friedrich; Zenklusen, Rico

doi:10.4230/LIPIcs.SoCG.2016.26

File

Subject Classification

Keywords

Geometric Dispersion
Embeddings
Approximation Algorithms
Convex Programming
Matroids

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

Document

0

Metadata

Abstract

Diversity maximization is an important concept in information retrieval, computational geometry and operations research. Usually, it is a variant of the following problem: Given a ground set, constraints, and a function f that measures diversity of a subset, the task is to select a feasible subset S such that f(S) is maximized. The sum-dispersion function f(S) which is the sum of the pairwise distances in S, is in this context a prominent diversification measure. The corresponding diversity maximization is the "max-sum" or "sum-sum" diversification. Many recent results deal with the design of constant-factor approximation algorithms of diversification problems involving sum-dispersion function under a matroid constraint. In this paper, we present a PTAS for the max-sum diversity problem under a matroid constraint for distances d(.,.) of negative type. Distances of negative type are, for example, metric distances stemming from the l_2 and l_1 norms, as well as the cosine or spherical, or Jaccard distance which are popular similarity metrics in web and image search. Our algorithm is based on techniques developed in geometric algorithms like metric embeddings and convex optimization. We show that one can compute a fractional solution of the usually non-convex relaxation of the problem which yields an upper bound on the optimum integer solution. Starting from this fractional solution, we employ a deterministic rounding approach which only incurs a small loss in terms of objective, thus leading to a PTAS. This technique can be applied to other previously studied variants of the max-sum dispersion function, including combinations of diversity with linear-score maximization, improving over the previous constant-factor approximation algorithms.

Cite As Get BibTex

Alfonso Cevallos, Friedrich Eisenbrand, and Rico Zenklusen. Max-Sum Diversity Via Convex Programming. In 32nd International Symposium on Computational Geometry (SoCG 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 51, pp. 26:1-26:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016) https://doi.org/10.4230/LIPIcs.SoCG.2016.26

Author Details

Alfonso Cevallos

Friedrich Eisenbrand

Rico Zenklusen

References

Z. Abbassi, V. S. Mirrokni, and M. Thakur. Diversity maximization under matroid constraints. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 32-40. ACM, 2013.
A. A. Ageev and M. I. Sviridenko. Pipage rounding: a new method of constructing algorithms with proven performance guarantee. Journal of Combinatorial Optimization, 8(23):307-328, 2004.
N. Alon, S. Arora, R. Manokaran, D. Moshkovitz, and O. Weinstein. Inapproximability of densest κ-subgraph from average case hardness. Unpublished manuscript, 2011.
S. Bhattacharya, S. Gollapudi, and K. Munagala. Consideration set generation in commerce search. In Proceedings of the 20th international conference on World wide web, pages 317-326. ACM, 2011.
B. Birnbaum and K. J. Goldman. An improved analysis for a greedy remote-clique algorithm using factor-revealing LPs. Algorithmica, 55(1):42-59, 2009.
L. M. Blumenthal. Theory and Applications of Distance Geometry, volume 347. Oxford, 1953.
A. Borodin, H. C. Lee, and Y. Ye. Max-sum diversification, monotone submodular functions and dynamic updates. In Proceedings of the 31st Symposium on Principles of Database Systems, pages 155-166. ACM, 2012.
G. Calinescu, C. Chekuri, M. Pál, and J. Vondrák. Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing, 40(6):1740-1766, 2011.
M. S. Charikar. Similarity estimation techniques from rounding algorithms. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing, pages 380-388. ACM, 2002.
C. Chekuri, J. Vondrák, and R. Zenklusen. Dependent randomized rounding via exchange properties of combinatorial structures. In Proceedings of the 51st IEEE Symposium on Foundations of Computer Science, pages 575-584, 2010.
M. M. Deza and M. Laurent. Geometry of Cuts and Metrics. Springer-Verlag, Berlin, 1997.
M. M. Deza and H. Maehara. Metric transforms and euclidean embeddings. Transactions of the American Mathematical Society, 317(2):661-671, 1990.
S. P. Fekete and H. Meijer. Maximum dispersion and geometric maximum weight cliques. Algorithmica, 38(3):501-511, 2004.
M. Fréchet. Les dimensions d'un ensemble abstrait. Mathematische Annalen, 68(2):145-168, 1910.
F. R. Giles. Submodular Functions, Graphs and Integer Polyhedra. PhD thesis, University of Waterloo, 1975.
S. Gollapudi and A. Sharma. An axiomatic approach for result diversification. In Proceedings of the 18th International Conference on World Wide Web, pages 381-390. ACM, 2009.
J. C. Gower and P. Legendre. Metric and euclidean properties of dissimilarity coefficients. Journal of Classification, 3(1):5-48, 1986.
M. Grötschel, L. Lovász, and A. Schrijver. Geometric Algorithms and Combinatorial Optimization, volume 2 of Algorithms and Combinatorics. Springer, 1988.
R. Hassin, S. Rubinstein, and A. Tamir. Approximation algorithms for maximum dispersion. Operations Research Letters, 21(3):133-137, 1997.
L. G. Khachiyan. A polynomial algorithm in linear programming. Doklady Akademii Nauk SSSR, 244:1093-1097, 1979.
M. K. Kozlov, S. P. Tarasov, and L. G. Khachiyan. The polynomial solvability of convex quadratic programming. USSR Computational Mathematics and Mathematical Physics, 20(5):223-228, 1980.
Q. Lv, M. Charikar, and K. Li. Image similarity search with compact data structures. In Proceedings of the 13th ACM International Conference on Information and Knowledge Management, pages 208-217. ACM, 2004.
K Makarychev, W Schudy, and M Sviridenko. Concentration inequalities for nonlinear matroid intersection. Random Structures &Algorithms, 46(3):541-571, 2015.
C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval, volume 1. Cambridge university press Cambridge, 2008.
J. Matoušek. Lecture notes on metric embeddings. https://kam.mff.cuni.cz/~matousek/ba-a4.pdf, 2013.
E. Pekalska and R. P. W. Duin. The Dissimilarity Representation for Pattern Recognition: Foundations And Applications (Machine Perception and Artificial Intelligence). World Scientific Publishing Co., Inc., River Edge, NJ, USA, 2005.
P. Raghavan and C. D. Tompson. Randomized rounding: a technique for provably good algorithms and algorithmic proofs. Combinatorica, 7(4):365-374, 1987.
S. S. Ravi, D. J. Rosenkrantz, and G. K. Tayi. Heuristic and special case algorithms for dispersion problems. Operations Research, 42(2):299-310, 1994.
G. Salton and M. J. MacGill. Introduction to modern information retrieval. McGraw-Hill computer science series, 1983.
I. J. Schoenberg. Metric spaces and completely monotone functions. Annals of Mathematics, pages 811-841, 1938.
I. J. Schoenberg. Metric spaces and positive definite functions. Transactions of the American Mathematical Society, 44(3):522-536, 1938.
A. Schrijver. Combinatorial optimization: polyhedra and efficiency, volume 24. Springer Science &Business Media, 2003.
M. Skutella. Convex quadratic and semidefinite programming relaxations in scheduling. Journal of the ACM, 48(2):206-242, 2001.

Max-Sum Diversity Via Convex Programming

Authors Alfonso Cevallos, Friedrich Eisenbrand, Rico Zenklusen

File

Document Identifiers

Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

References

Thanks for your feedback!

Could not send message