Improved Approximation Algorithms and Lower Bounds for Search-Diversification Problems

eng Schloss Dagstuhl – Leibniz-Zentrum für Informatik Leibniz International Proceedings in Informatics 1868-8969 2022-06-28 7:1 7:18 10.4230/LIPIcs.ICALP.2022.7 article Improved Approximation Algorithms and Lower Bounds for Search-Diversification Problems Abboud, Amir 1 Cohen-Addad, Vincent 2 Lee, Euiwoong 3 Manurangsi, Pasin 4 Weizmann Institute of Science, Rehovot, Israel Google Research, Zürich, Switzerland University of Michigan, Ann Arbor, MI, USA Google Research, Mountain View, CA, USA We study several questions related to diversifying search results. We give improved approximation algorithms in each of the following problems, together with some lower bounds. 1) We give a polynomial-time approximation scheme (PTAS) for a diversified search ranking problem [Nikhil Bansal et al., 2010] whose objective is to minimizes the discounted cumulative gain. Our PTAS runs in time n^{2^O(log(1/ε)/ε)} ⋅ m^O(1) where n denotes the number of elements in the databases and m denotes the number of constraints. Complementing this result, we show that no PTAS can run in time f(ε) ⋅ (nm)^{2^o(1/ε)} assuming Gap-ETH and therefore our running time is nearly tight. Both our upper and lower bounds answer open questions from [Nikhil Bansal et al., 2010]. 2) We next consider the Max-Sum Dispersion problem, whose objective is to select k out of n elements from a database that maximizes the dispersion, which is defined as the sum of the pairwise distances under a given metric. We give a quasipolynomial-time approximation scheme (QPTAS) for the problem which runs in time n^{O_ε(log n)}. This improves upon previously known polynomial-time algorithms with approximate ratios 0.5 [Refael Hassin et al., 1997; Allan Borodin et al., 2017]. Furthermore, we observe that reductions from previous work rule out approximation schemes that run in n^õ_ε(log n) time assuming ETH. 3) Finally, we consider a generalization of Max-Sum Dispersion called Max-Sum Diversification. In addition to the sum of pairwise distance, the objective also includes another function f. For monotone submodular function f, we give a quasipolynomial-time algorithm with approximation ratio arbitrarily close to (1-1/e). This improves upon the best polynomial-time algorithm which has approximation ratio 0.5 [Allan Borodin et al., 2017]. Furthermore, the (1-1/e) factor is also tight as achieving better-than-(1-1/e) approximation is NP-hard [Uriel Feige, 1998]. https://drops.dagstuhl.de/storage/00lipics/lipics-vol229-icalp2022/LIPIcs.ICALP.2022.7/LIPIcs.ICALP.2022.7.pdf Approximation Algorithms Complexity Data Mining Diversification

<publisher>Schloss Dagstuhl – Leibniz-Zentrum für Informatik</publisher>

<journalTitle>Leibniz International Proceedings in Informatics</journalTitle>

<doi>10.4230/LIPIcs.ICALP.2022.7</doi>

<documentType>article</documentType>

<title language="eng">Improved Approximation Algorithms and Lower Bounds for Search-Diversification Problems</title>

<name>Abboud, Amir</name>

</author>

<name>Cohen-Addad, Vincent</name>

</author>

<name>Lee, Euiwoong</name>

</author>

<name>Manurangsi, Pasin</name>

</author>

</authors>

<affiliationName affiliationId="1">Weizmann Institute of Science, Rehovot, Israel</affiliationName>

<affiliationName affiliationId="2">Google Research, Zürich, Switzerland</affiliationName>

<affiliationName affiliationId="3">University of Michigan, Ann Arbor, MI, USA</affiliationName>

<affiliationName affiliationId="4">Google Research, Mountain View, CA, USA</affiliationName>

</affiliationsList>

<abstract language="eng">We study several questions related to diversifying search results. We give improved approximation algorithms in each of the following problems, together with some lower bounds. 1) We give a polynomial-time approximation scheme (PTAS) for a diversified search ranking problem [Nikhil Bansal et al., 2010] whose objective is to minimizes the discounted cumulative gain. Our PTAS runs in time n^{2^O(log(1/ε)/ε)} ⋅ m^O(1) where n denotes the number of elements in the databases and m denotes the number of constraints. Complementing this result, we show that no PTAS can run in time f(ε) ⋅ (nm)^{2^o(1/ε)} assuming Gap-ETH and therefore our running time is nearly tight. Both our upper and lower bounds answer open questions from [Nikhil Bansal et al., 2010]. 2) We next consider the Max-Sum Dispersion problem, whose objective is to select k out of n elements from a database that maximizes the dispersion, which is defined as the sum of the pairwise distances under a given metric. We give a quasipolynomial-time approximation scheme (QPTAS) for the problem which runs in time n^{O_ε(log n)}. This improves upon previously known polynomial-time algorithms with approximate ratios 0.5 [Refael Hassin et al., 1997; Allan Borodin et al., 2017]. Furthermore, we observe that reductions from previous work rule out approximation schemes that run in n^õ_ε(log n) time assuming ETH. 3) Finally, we consider a generalization of Max-Sum Dispersion called Max-Sum Diversification. In addition to the sum of pairwise distance, the objective also includes another function f. For monotone submodular function f, we give a quasipolynomial-time algorithm with approximation ratio arbitrarily close to (1-1/e). This improves upon the best polynomial-time algorithm which has approximation ratio 0.5 [Allan Borodin et al., 2017]. Furthermore, the (1-1/e) factor is also tight as achieving better-than-(1-1/e) approximation is NP-hard [Uriel Feige, 1998].</abstract>

<fullTextUrl format="pdf">https://drops.dagstuhl.de/storage/00lipics/lipics-vol229-icalp2022/LIPIcs.ICALP.2022.7/LIPIcs.ICALP.2022.7.pdf</fullTextUrl>

<keyword>Approximation Algorithms</keyword>

<keyword>Complexity</keyword>

<keyword>Data Mining</keyword>

<keyword>Diversification</keyword>

</keywords>

</record>

</records>