Approximate Maximum Rank Aggregation: Beyond the Worst-Case

Alvin, Yan Hong Yao; Chakraborty, Diptarka

doi:10.4230/LIPIcs.FSTTCS.2023.12

Abstract

The fundamental task of rank aggregation is to combine multiple rankings of a group of candidates into a single ranking to mitigate biases inherent in individual input rankings. This task has myriad applications, such as in social choice theory, collaborative filtering, web search, statistics, databases, sports, and admission systems. One popular version of this task, maximum rank aggregation (or the center ranking problem), aims to find a ranking (not necessarily from the input set) that minimizes the maximum distance to the input rankings. However, even for four input rankings, this problem is NP-hard (Dwork et al., WWW'01, and Biedl et al., Discrete Math.'09), and only a (folklore) polynomial-time 2-approximation algorithm is known for finding an aggregate ranking under the commonly used Kendall-tau distance metric. Achieving a better approximation factor in polynomial time, ideally a polynomial-time approximation scheme (PTAS), is one of the major challenges.
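To make the objective concrete, here is a minimal Python sketch of the Kendall-tau distance, the maximum-distance objective, and one common way to obtain a 2-approximation in any metric space: return the best ranking among the inputs, which is within a factor of 2 of the optimum by the triangle inequality. The function names (`kendall_tau`, `max_distance`, `best_input_ranking`, `brute_force_center`) and the toy input are illustrative assumptions, not taken from the paper, and the brute-force search is included only to check the factor on tiny inputs.

```python
from itertools import combinations, permutations

def kendall_tau(a, b):
    """Number of candidate pairs that rankings a and b order differently."""
    pos_in_b = {c: i for i, c in enumerate(b)}
    # (x, y) ranges over pairs with x ahead of y in a; count those reversed in b.
    return sum(1 for x, y in combinations(a, 2) if pos_in_b[x] > pos_in_b[y])

def max_distance(center, rankings):
    """Objective of maximum rank aggregation: the largest distance to any input."""
    return max(kendall_tau(center, r) for r in rankings)

def best_input_ranking(rankings):
    """Baseline sketch: the input ranking with the smallest maximum distance."""
    return min(rankings, key=lambda r: max_distance(r, rankings))

def brute_force_center(rankings):
    """Exact optimum by enumerating all permutations (tiny inputs only)."""
    return min(permutations(rankings[0]), key=lambda p: max_distance(p, rankings))

# Hypothetical toy instance: four rankings of four candidates.
rankings = [(0, 1, 2, 3), (1, 0, 3, 2), (3, 2, 1, 0), (0, 2, 1, 3)]
approx_cost = max_distance(best_input_ranking(rankings), rankings)
opt_cost = max_distance(brute_force_center(rankings), rankings)
print(approx_cost, opt_cost)
```

On any instance, `approx_cost` lies between `opt_cost` and `2 * opt_cost`; the brute-force routine takes time factorial in the number of candidates and is not meant as a practical method.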
This paper presents significant progress in solving this problem by considering the Mallows model, a classical probabilistic model. Our proposed algorithm outputs a (1+ε)-approximate aggregate ranking for any ε > 0, with high probability, as long as the input rankings come from a Mallows model, even in a streaming fashion. Furthermore, the same approximation guarantee is achieved even in the presence of outliers, presumably a more challenging task.
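For readers unfamiliar with the Mallows model, the sketch below shows one standard way to draw rankings from it, the repeated insertion procedure. It assumes the common parameterization in which a ranking π has probability proportional to φ^d(π, π₀), where π₀ is a central ranking, d is the Kendall-tau distance, and 0 ≤ φ ≤ 1 is a dispersion parameter; the abstract does not fix a parameterization, so this choice, like the name `sample_mallows`, is an assumption. The code only generates inputs and does not implement the paper's algorithm.

```python
import random

def sample_mallows(center, phi, rng):
    """Draw one ranking from a Mallows model by repeated insertion.

    center: the central ranking (sequence of distinct candidates).
    phi:    dispersion in [0, 1]; 0 always returns center, 1 is uniform.
    rng:    a random.Random instance, passed in for reproducibility.
    """
    ranking = []
    for i, item in enumerate(center):
        # i items are already placed. Inserting at index j puts the new item
        # ahead of i - j of them, creating i - j inversions, so the weight
        # of position j is phi ** (i - j).
        weights = [phi ** (i - j) for j in range(i + 1)]
        j = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(j, item)
    return tuple(ranking)

# Hypothetical usage: five noisy copies of a central ranking of six candidates.
rng = random.Random(0)
center = (0, 1, 2, 3, 4, 5)
samples = [sample_mallows(center, 0.3, rng) for _ in range(5)]
```

Smaller values of `phi` concentrate the samples around `center`, which is the regime where one would expect inputs to carry the most information about the central ranking.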

Nir Ailon, Moses Charikar, and Alantha Newman. Aggregating inconsistent information: ranking and clustering. Journal of the ACM (JACM), 55(5):1-27, 2008.
Pranjal Awasthi, Avrim Blum, Or Sheffet, and Aravindan Vijayaraghavan. Learning mixtures of ranking models. Advances in Neural Information Processing Systems, 27, 2014.
Christian Bachmaier, Franz J. Brandenburg, Andreas Gleißner, and Andreas Hofmeier. On the hardness of maximum rank aggregation problems. Journal of Discrete Algorithms, 31:2-13, 2015. 24th International Workshop on Combinatorial Algorithms (IWOCA 2013).
Mihai Bādoiu, Sariel Har-Peled, and Piotr Indyk. Approximate clustering via core-sets. In Proceedings of the thirty-fourth annual ACM Symposium on Theory of Computing, pages 250-257, 2002.
Therese Biedl, Franz J Brandenburg, and Xiaotie Deng. On the complexity of crossings in permutations. Discrete Mathematics, 309(7):1813-1823, 2009.
Ralph Allan Bradley and Milton E Terry. Rank analysis of incomplete block designs: I. the method of paired comparisons. Biometrika, 39(3/4):324-345, 1952.
Felix Brandt, Vincent Conitzer, Ulle Endriss, Jérôme Lang, and Ariel D Procaccia. Handbook of computational social choice. Cambridge University Press, 2016.
Mark Braverman and Elchanan Mossel. Sorting from noisy information. arXiv preprint, 2009. URL: https://arxiv.org/abs/0910.1191.
Marc Bury and Chris Schwiegelshohn. On finding the Jaccard center. In 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017). Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017.
Ioannis Caragiannis, Ariel D Procaccia, and Nisarg Shah. When do noisy votes reveal the truth? ACM Transactions on Economics and Computation (TEAC), 4(3):1-30, 2016.
Diptarka Chakraborty, Debarati Das, and Robert Krauthgamer. Approximating the median under the Ulam metric. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 761-775. SIAM, 2021.
Diptarka Chakraborty, Debarati Das, and Robert Krauthgamer. Clustering permutations: New techniques with streaming applications. In Yael Tauman Kalai, editor, 14th Innovations in Theoretical Computer Science Conference, ITCS 2023, January 10-13, 2023, MIT, Cambridge, Massachusetts, USA, volume 251 of LIPIcs, pages 31:1-31:24. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2023.
Diptarka Chakraborty, Kshitij Gajjar, and Agastya Vibhuti Jha. Approximating the center ranking under Ulam. In 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021). Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
Flavio Chierichetti, Anirban Dasgupta, Ravi Kumar, and Silvio Lattanzi. On reconstructing a hidden permutation. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014). Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2014. URL: https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2014.604.
Fabien Collas and Ekhine Irurozki. Concentric mixtures of Mallows models for top-k rankings: sampling and identifiability. In International Conference on Machine Learning, pages 2079-2088. PMLR, 2021.
Anindya De, Ryan O'Donnell, and Rocco Servedio. Learning sparse mixtures of rankings from noisy information. arXiv preprint, 2018. URL: https://arxiv.org/abs/1811.01216.
Persi Diaconis and R. L. Graham. Spearman’s footrule as a measure of disarray. Journal of the Royal Statistical Society. Series B (Methodological), 39(2):262-268, 1977. URL: http://www.jstor.org/stable/2984804.
Jean-Paul Doignon, Aleksandar Pekeč, and Michel Regenwetter. The repeated insertion model for rankings: Missing link between two subset choice models. Psychometrika, 69(1):33-54, 2004.
Cynthia Dwork, Ravi Kumar, Moni Naor, and D. Sivakumar. Rank aggregation methods for the web. In Proceedings of the Tenth International World Wide Web Conference, WWW 10, pages 613-622, 2001. URL: https://doi.org/10.1145/371920.372165.
Ronald Fagin, Ravi Kumar, and D. Sivakumar. Efficient similarity search and classification via rank aggregation. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, San Diego, California, USA, June 9-12, 2003, pages 301-312, 2003.
Moti Frances and Ami Litman. On covering problems of codes. Theory of Computing Systems, 30(2):113-119, 1997.
David F Gleich and Lek-Heng Lim. Rank aggregation via nuclear norm minimization. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 60-68, 2011.
Donna Harman. Ranking algorithms. In William B. Frakes and Ricardo A. Baeza-Yates, editors, Information Retrieval: Data Structures & Algorithms, pages 363-392. Prentice-Hall, 1992.
Kenneth Hung and William Fithian. Rank verification for exponential families. The Annals of Statistics, 47(2):758-782, 2019.
David R Hunter. MM algorithms for generalized Bradley-Terry models. The Annals of Statistics, 32(1):384-406, 2004.
John G Kemeny. Mathematics without numbers. Daedalus, 88(4):577-591, 1959.
Maurice G Kendall. A new measure of rank correlation. Biometrika, 30(1/2):81-93, 1938.
Claire Kenyon-Mathieu and Warren Schudy. How to rank with few errors. In Proceedings of the thirty-ninth annual ACM Symposium on Theory of Computing, pages 95-103, 2007.
Ashish Khetan and Sewoong Oh. Data-driven rank breaking for efficient rank aggregation. In International Conference on Machine Learning, pages 89-98. PMLR, 2016.
Ming Li, Bin Ma, and Lusheng Wang. On the closest string and substring problems. Journal of the ACM (JACM), 49(2):157-171, March 2002.
Allen Liu and Ankur Moitra. Efficiently learning mixtures of Mallows models. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 627-638. IEEE, 2018.
Allen Liu and Ankur Moitra. Robust voting rules from algorithmic robust statistics. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3471-3512. SIAM, 2023.
R Duncan Luce. Individual choice behavior, 1959.
Bin Ma and Xiaoming Sun. More efficient algorithms for closest string and substring problems. SIAM Journal on Computing, 39(4):1432-1443, 2010.
Colin L Mallows. Non-null ranking models. I. Biometrika, 44(1/2):114-130, 1957.
Nimrod Megiddo. Linear programming in linear time when the dimension is fixed. Journal of the ACM (JACM), 31(1):114-127, 1984.
François Nicolas and Eric Rivals. Complexities of the centre and median string problems. In Combinatorial Pattern Matching, 14th Annual Symposium, CPM 2003, Morelia, Michoacán, Mexico, June 25-27, 2003, Proceedings, pages 315-327, 2003.
Vasyl Pihur, Susmita Datta, and Somnath Datta. Weighted rank aggregation of cluster validation measures: a Monte Carlo cross-entropy approach. Bioinformatics, 23(13):1607-1615, 2007.
Robin L Plackett. The analysis of permutations. Journal of the Royal Statistical Society: Series C (Applied Statistics), 24(2):193-202, 1975.
V Yu Popov. Multiple genome rearrangement by swaps and by element duplications. Theoretical Computer Science, 385(1-3):115-126, 2007.
Antti-Veikko Rosti, Necip Fazil Ayan, Bing Xiang, Spyros Matsoukas, Richard Schwartz, and Bonnie Dorr. Combining outputs from multiple machine translation systems. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference, pages 228-235, 2007.
Warren Schudy. Approximation schemes for inferring rankings and clusterings from pairwise data. Ph.D. diss., Brown University, 2012.
Nihar B Shah and Martin J Wainwright. Simple, robust and optimal ranking from pairwise comparisons. The Journal of Machine Learning Research, 18(1):7246-7283, 2017.
James Joseph Sylvester. A question in the geometry of situation. Quarterly Journal of Pure and Applied Mathematics, 1(1):79-80, 1857.
Wenpin Tang. Mallows ranking models: maximum likelihood estimate and regeneration. In International Conference on Machine Learning, pages 6125-6134. PMLR, 2019.
E Alper Yildirim. Two algorithms for the minimum enclosing ball problem. SIAM Journal on Optimization, 19(3):1368-1391, 2008.
H Peyton Young. Condorcet’s theory of voting. American Political Science Review, 82(4):1231-1244, 1988.
H Peyton Young and Arthur Levenglick. A consistent extension of Condorcet’s election principle. SIAM Journal on Applied Mathematics, 35(2):285-300, 1978.
