Improved Diversity Maximization Algorithms for Matching and Pseudoforest

Mahabadi, Sepideh; Narayanan, Shyam

doi:10.4230/LIPIcs.APPROX/RANDOM.2023.25

Subject Classification

ACM Subject Classification

Theory of computation → Approximation algorithms analysis
Theory of computation → Computational geometry

Keywords

diversity maximization
approximation algorithms
composable coresets

Abstract

In this work we consider the diversity maximization problem: given a data set X of n elements and a parameter k, the goal is to pick a subset of X of size k that maximizes a certain diversity measure. Chandra and Halldórsson [Barun Chandra and Magnús M. Halldórsson, 2001] defined a variety of diversity measures based on pairwise distances between the points. A constant factor approximation algorithm was known for all of those diversity measures except "remote-matching", for which only an O(log k) approximation was known. In this work we present an O(1) approximation for this remaining notion. Further, we consider these notions from the perspective of composable coresets. Indyk et al. [Piotr Indyk et al., 2014] provided composable coresets with a constant factor approximation for all but "remote-pseudoforest" and "remote-matching", for which they again obtained only an O(log k) approximation. Here we also close the gap up to constants and present a constant factor composable coreset algorithm for these two notions. For remote-matching, our coreset has size only O(k); for remote-pseudoforest, our coreset has size O(k^{1+ε}) for any ε > 0 and is an O(1/ε)-approximate coreset.
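
To make the two objectives concrete, the following is a minimal brute-force sketch, not the algorithms of the paper. It assumes the usual definitions from the dispersion literature, which the abstract itself does not spell out: the remote-matching value of a set is the cost of its minimum-weight perfect matching, and the remote-pseudoforest value is the sum, over all points, of the distance to the nearest other point in the set. The function names and the use of Euclidean distance are illustrative assumptions.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def remote_pseudoforest(points):
    """Sum over each point of the distance to its nearest other point (assumed definition)."""
    return sum(
        min(dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    )


def remote_matching(points):
    """Cost of a minimum-weight perfect matching, by exhaustive recursion.

    Assumes an even number of points; the running time is exponential,
    so this is only suitable for tiny examples.
    """
    if not points:
        return 0.0
    first, rest = points[0], points[1:]
    # Pair the first point with every possible partner and recurse on the remainder.
    return min(
        dist(first, rest[i]) + remote_matching(rest[:i] + rest[i + 1:])
        for i in range(len(rest))
    )


def best_subset(X, k, measure):
    """Exact diversity maximization: try every size-k subset of X."""
    return max(combinations(X, k), key=lambda S: measure(list(S)))


if __name__ == "__main__":
    X = [(0, 0), (1, 0), (3, 0), (7, 0)]
    print(remote_matching(X))        # matching cost of the whole set
    print(remote_pseudoforest(X))    # nearest-neighbor sum of the whole set
    print(best_subset(X, 2, remote_matching))
```

Enumerating all size-k subsets takes time exponential in k, which is why approximation algorithms and small composable coresets are of interest for these measures.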

Cite As

Sepideh Mahabadi and Shyam Narayanan. Improved Diversity Maximization Algorithms for Matching and Pseudoforest. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 275, pp. 25:1-25:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023) https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2023.25

Author Details

Sepideh Mahabadi

Microsoft Research, Redmond, WA, USA

Shyam Narayanan

Massachusetts Institute of Technology, Cambridge, MA, USA

References

Sofiane Abbar, Sihem Amer-Yahia, Piotr Indyk, and Sepideh Mahabadi. Real-time recommendation of diverse related articles. In Proceedings of the 22nd international conference on World Wide Web, pages 1-12, 2013.
Sofiane Abbar, Sihem Amer-Yahia, Piotr Indyk, Sepideh Mahabadi, and Kasturi R Varadarajan. Diverse near neighbor problem. In Proceedings of the twenty-ninth annual symposium on Computational geometry, pages 207-214, 2013.
Zeinab Abbassi, Vahab S Mirrokni, and Mayur Thakur. Diversity Maximization Under Matroid Constraints. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data mining, pages 32-40, 2013.
Sepideh Aghamolaei, Majid Farhadi, and Hamid Zarrabi-Zadeh. Diversity maximization via composable coresets. In CCCG, pages 38-48, 2015.
Albert Angel and Nick Koudas. Efficient diversity-aware search. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, pages 781-792, 2011.
Sepehr Assadi and Sanjeev Khanna. Randomized composable coresets for matching and vertex cover. arXiv preprint, 2017. URL: https://arxiv.org/abs/1705.08242.
Aditya Bhaskara, Mehrdad Ghadiri, Vahab S. Mirrokni, and Ola Svensson. Linear relaxations for finding diverse elements in metric spaces. In Advances in Neural Information Processing Systems, pages 4098-4106, 2016.
Allan Borodin, Hyun Chul Lee, and Yuli Ye. Max-sum diversification, monotone submodular functions and dynamic updates. In Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of Database Systems, pages 155-166, 2012.
Matteo Ceccarello, Andrea Pietracaprina, Geppino Pucci, and Eli Upfal. Mapreduce and streaming algorithms for diversity maximization in metric spaces of bounded doubling dimension. arXiv preprint, 2016. URL: https://arxiv.org/abs/1605.05590.
Alfonso Cevallos, Friedrich Eisenbrand, and Sarah Morell. Diversity maximization in doubling metrics. arXiv preprint, 2018. URL: https://arxiv.org/abs/1809.09521.
Barun Chandra and Magnús M. Halldórsson. Approximation algorithms for dispersion problems. J. Algorithms, 38(2):438-465, 2001.
Artur Czumaj and Christian Sohler. Estimating the weight of metric minimum spanning trees in sublinear time. SIAM J. Comput., 39(3):904-922, 2009.
Marina Drosou and Evaggelia Pitoura. Search result diversification. ACM SIGMOD Record, 39(1):41-47, 2010.
Alessandro Epasto, Mohammad Mahdian, Vahab Mirrokni, and Peilin Zhong. Improved sliding window algorithms for clustering and coverage via bucketing-based sketches. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3005-3042. SIAM, 2022.
Alessandro Epasto, Vahab Mirrokni, and Morteza Zadimoghaddam. Scalable diversity maximization via small-size composable core-sets (brief announcement). In The 31st ACM symposium on parallelism in algorithms and architectures, pages 41-42, 2019.
E. N. Gilbert and H. O. Pollak. Steiner minimal trees. SIAM J. Appl. Math., 16:1-29, 1968.
Sreenivas Gollapudi and Aneesh Sharma. An axiomatic approach for result diversification. In Proceedings of the 18th international conference on World wide web, pages 381-390, 2009.
Boqing Gong, Wei-Lun Chao, Kristen Grauman, and Fei Sha. Diverse sequential subset selection for supervised video summarization. Advances in neural information processing systems, 27, 2014.
Teofilo F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theor. Comput. Sci., 38:293-306, 1985.
Magnús M Halldórsson, Kazuo Iwano, Naoki Katoh, and Takeshi Tokuyama. Finding subsets maximizing minimum structures. SIAM Journal on Discrete Mathematics, 12(3):342-359, 1999.
Piotr Indyk. Algorithms for dynamic geometric problems over data streams. In László Babai, editor, Proceedings of the 36th Annual ACM Symposium on Theory of Computing, Chicago, IL, USA, June 13-16, 2004, pages 373-380. ACM, 2004.
Piotr Indyk, Sepideh Mahabadi, Shayan Oveis Gharan, and Alireza Rezaei. Composable core-sets for determinant maximization problems via spectral spanners. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1675-1694. SIAM, 2020.
Piotr Indyk, Sepideh Mahabadi, Mohammad Mahdian, and Vahab S. Mirrokni. Composable core-sets for diversity and coverage maximization. In Richard Hull and Martin Grohe, editors, Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS'14, Snowbird, UT, USA, June 22-27, 2014, pages 100-108. ACM, 2014.
Anoop Jain, Parag Sarda, and Jayant R Haritsa. Providing diversity in k-nearest neighbor query results. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pages 404-413. Springer, 2004.
Hui Lin and Jeff Bilmes. A class of submodular functions for document summarization. In Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, pages 510-520, 2011.
Hui Lin, Jeff Bilmes, and Shasha Xie. Graph-based submodular selection for extractive summarization. In 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, pages 381-386. IEEE, 2009.
Sepideh Mahabadi, Piotr Indyk, Shayan Oveis Gharan, and Alireza Rezaei. Composable core-sets for determinant maximization: A simple near-optimal algorithm. In International Conference on Machine Learning, pages 4254-4263. PMLR, 2019.
Vahab Mirrokni and Morteza Zadimoghaddam. Randomized composable core-sets for distributed submodular maximization. In Proceedings of the forty-seventh annual ACM symposium on Theory of computing, pages 153-162. ACM, 2015.
Julien Pilourdault, Sihem Amer-Yahia, Dongwon Lee, and Senjuti Basu Roy. Motivation-aware task assignment in crowdsourcing. In EDBT, 2017.
S. S. Ravi, D. J. Rosenkrantz, and G. K. Tayi. Facility dispersion problems: Heuristics and special cases. Algorithms and Data Structures, 519:355-366, 1991.
Michael J Welch, Junghoo Cho, and Christopher Olston. Search result diversity for informational queries. In Proceedings of the 20th international conference on World wide web, pages 237-246, 2011.
Cong Yu, Laks VS Lakshmanan, and Sihem Amer-Yahia. Recommendation diversification using explanations. In 2009 IEEE 25th International Conference on Data Engineering, pages 1299-1302. IEEE, 2009.
Tao Zhou, Zoltán Kuscsik, Jian-Guo Liu, Matúš Medo, Joseph Rushton Wakeling, and Yi-Cheng Zhang. Solving the apparent diversity-accuracy dilemma of recommender systems. Proceedings of the National Academy of Sciences, 107(10):4511-4515, 2010.
Cai-Nicolas Ziegler, Sean M McNee, Joseph A Konstan, and Georg Lausen. Improving recommendation lists through topic diversification. In Proceedings of the 14th international conference on World Wide Web, pages 22-32, 2005.
