Max-Sum Diversity Via Convex Programming

Authors Alfonso Cevallos, Friedrich Eisenbrand, Rico Zenklusen



PDF
Thumbnail PDF

File

LIPIcs.SoCG.2016.26.pdf
  • Filesize: 0.5 MB
  • 14 pages

Document Identifiers

Author Details

Alfonso Cevallos
Friedrich Eisenbrand
Rico Zenklusen

Cite AsGet BibTex

Alfonso Cevallos, Friedrich Eisenbrand, and Rico Zenklusen. Max-Sum Diversity Via Convex Programming. In 32nd International Symposium on Computational Geometry (SoCG 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 51, pp. 26:1-26:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)
https://doi.org/10.4230/LIPIcs.SoCG.2016.26

Abstract

Diversity maximization is an important concept in information retrieval, computational geometry and operations research. Usually, it is a variant of the following problem: Given a ground set, constraints, and a function f that measures diversity of a subset, the task is to select a feasible subset S such that f(S) is maximized. The sum-dispersion function f(S) which is the sum of the pairwise distances in S, is in this context a prominent diversification measure. The corresponding diversity maximization is the "max-sum" or "sum-sum" diversification. Many recent results deal with the design of constant-factor approximation algorithms of diversification problems involving sum-dispersion function under a matroid constraint. In this paper, we present a PTAS for the max-sum diversity problem under a matroid constraint for distances d(.,.) of negative type. Distances of negative type are, for example, metric distances stemming from the l_2 and l_1 norms, as well as the cosine or spherical, or Jaccard distance which are popular similarity metrics in web and image search. Our algorithm is based on techniques developed in geometric algorithms like metric embeddings and convex optimization. We show that one can compute a fractional solution of the usually non-convex relaxation of the problem which yields an upper bound on the optimum integer solution. Starting from this fractional solution, we employ a deterministic rounding approach which only incurs a small loss in terms of objective, thus leading to a PTAS. This technique can be applied to other previously studied variants of the max-sum dispersion function, including combinations of diversity with linear-score maximization, improving over the previous constant-factor approximation algorithms.
Keywords
  • Geometric Dispersion
  • Embeddings
  • Approximation Algorithms
  • Convex Programming
  • Matroids

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Z. Abbassi, V. S. Mirrokni, and M. Thakur. Diversity maximization under matroid constraints. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 32-40. ACM, 2013. Google Scholar
  2. A. A. Ageev and M. I. Sviridenko. Pipage rounding: a new method of constructing algorithms with proven performance guarantee. Journal of Combinatorial Optimization, 8(23):307-328, 2004. Google Scholar
  3. N. Alon, S. Arora, R. Manokaran, D. Moshkovitz, and O. Weinstein. Inapproximability of densest κ-subgraph from average case hardness. Unpublished manuscript, 2011. Google Scholar
  4. S. Bhattacharya, S. Gollapudi, and K. Munagala. Consideration set generation in commerce search. In Proceedings of the 20th international conference on World wide web, pages 317-326. ACM, 2011. Google Scholar
  5. B. Birnbaum and K. J. Goldman. An improved analysis for a greedy remote-clique algorithm using factor-revealing LPs. Algorithmica, 55(1):42-59, 2009. Google Scholar
  6. L. M. Blumenthal. Theory and Applications of Distance Geometry, volume 347. Oxford, 1953. Google Scholar
  7. A. Borodin, H. C. Lee, and Y. Ye. Max-sum diversification, monotone submodular functions and dynamic updates. In Proceedings of the 31st Symposium on Principles of Database Systems, pages 155-166. ACM, 2012. Google Scholar
  8. G. Calinescu, C. Chekuri, M. Pál, and J. Vondrák. Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing, 40(6):1740-1766, 2011. Google Scholar
  9. M. S. Charikar. Similarity estimation techniques from rounding algorithms. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing, pages 380-388. ACM, 2002. Google Scholar
  10. C. Chekuri, J. Vondrák, and R. Zenklusen. Dependent randomized rounding via exchange properties of combinatorial structures. In Proceedings of the 51st IEEE Symposium on Foundations of Computer Science, pages 575-584, 2010. Google Scholar
  11. M. M. Deza and M. Laurent. Geometry of Cuts and Metrics. Springer-Verlag, Berlin, 1997. Google Scholar
  12. M. M. Deza and H. Maehara. Metric transforms and euclidean embeddings. Transactions of the American Mathematical Society, 317(2):661-671, 1990. Google Scholar
  13. S. P. Fekete and H. Meijer. Maximum dispersion and geometric maximum weight cliques. Algorithmica, 38(3):501-511, 2004. Google Scholar
  14. M. Fréchet. Les dimensions d'un ensemble abstrait. Mathematische Annalen, 68(2):145-168, 1910. Google Scholar
  15. F. R. Giles. Submodular Functions, Graphs and Integer Polyhedra. PhD thesis, University of Waterloo, 1975. Google Scholar
  16. S. Gollapudi and A. Sharma. An axiomatic approach for result diversification. In Proceedings of the 18th International Conference on World Wide Web, pages 381-390. ACM, 2009. Google Scholar
  17. J. C. Gower and P. Legendre. Metric and euclidean properties of dissimilarity coefficients. Journal of Classification, 3(1):5-48, 1986. Google Scholar
  18. M. Grötschel, L. Lovász, and A. Schrijver. Geometric Algorithms and Combinatorial Optimization, volume 2 of Algorithms and Combinatorics. Springer, 1988. Google Scholar
  19. R. Hassin, S. Rubinstein, and A. Tamir. Approximation algorithms for maximum dispersion. Operations Research Letters, 21(3):133-137, 1997. Google Scholar
  20. L. G. Khachiyan. A polynomial algorithm in linear programming. Doklady Akademii Nauk SSSR, 244:1093-1097, 1979. Google Scholar
  21. M. K. Kozlov, S. P. Tarasov, and L. G. Khachiyan. The polynomial solvability of convex quadratic programming. USSR Computational Mathematics and Mathematical Physics, 20(5):223-228, 1980. Google Scholar
  22. Q. Lv, M. Charikar, and K. Li. Image similarity search with compact data structures. In Proceedings of the 13th ACM International Conference on Information and Knowledge Management, pages 208-217. ACM, 2004. Google Scholar
  23. K Makarychev, W Schudy, and M Sviridenko. Concentration inequalities for nonlinear matroid intersection. Random Structures &Algorithms, 46(3):541-571, 2015. Google Scholar
  24. C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval, volume 1. Cambridge university press Cambridge, 2008. Google Scholar
  25. J. Matoušek. Lecture notes on metric embeddings. https://kam.mff.cuni.cz/~matousek/ba-a4.pdf, 2013.
  26. E. Pekalska and R. P. W. Duin. The Dissimilarity Representation for Pattern Recognition: Foundations And Applications (Machine Perception and Artificial Intelligence). World Scientific Publishing Co., Inc., River Edge, NJ, USA, 2005. Google Scholar
  27. P. Raghavan and C. D. Tompson. Randomized rounding: a technique for provably good algorithms and algorithmic proofs. Combinatorica, 7(4):365-374, 1987. Google Scholar
  28. S. S. Ravi, D. J. Rosenkrantz, and G. K. Tayi. Heuristic and special case algorithms for dispersion problems. Operations Research, 42(2):299-310, 1994. Google Scholar
  29. G. Salton and M. J. MacGill. Introduction to modern information retrieval. McGraw-Hill computer science series, 1983. Google Scholar
  30. I. J. Schoenberg. Metric spaces and completely monotone functions. Annals of Mathematics, pages 811-841, 1938. Google Scholar
  31. I. J. Schoenberg. Metric spaces and positive definite functions. Transactions of the American Mathematical Society, 44(3):522-536, 1938. Google Scholar
  32. A. Schrijver. Combinatorial optimization: polyhedra and efficiency, volume 24. Springer Science &Business Media, 2003. Google Scholar
  33. M. Skutella. Convex quadratic and semidefinite programming relaxations in scheduling. Journal of the ACM, 48(2):206-242, 2001. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail