Efficient Algorithms for k-Regret Minimizing Sets

Authors Pankaj K. Agarwal, Nirman Kumar, Stavros Sintos, Subhash Suri



PDF
Thumbnail PDF

File

LIPIcs.SEA.2017.7.pdf
  • Filesize: 3.58 MB
  • 23 pages

Document Identifiers

Author Details

Pankaj K. Agarwal
Nirman Kumar
Stavros Sintos
Subhash Suri

Cite AsGet BibTex

Pankaj K. Agarwal, Nirman Kumar, Stavros Sintos, and Subhash Suri. Efficient Algorithms for k-Regret Minimizing Sets. In 16th International Symposium on Experimental Algorithms (SEA 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 75, pp. 7:1-7:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)
https://doi.org/10.4230/LIPIcs.SEA.2017.7

Abstract

A regret minimizing set Q is a small size representation of a much larger database P so that user queries executed on Q return answers whose scores are not much worse than those on the full dataset. In particular, a k-regret minimizing set has the property that the regret ratio between the score of the top-1 item in Q and the score of the top-k item in P is minimized, where the score of an item is the inner product of the item's attributes with a user's weight (preference) vector. The problem is challenging because we want to find a single representative set Q whose regret ratio is small with respect to all possible user weight vectors. We show that k-regret minimization is NP-Complete for all dimensions d>=3, settling an open problem from Chester et al. [VLDB 2014]. Our main algorithmic contributions are two approximation algorithms, both with provable guarantees, one based on coresets and another based on hitting sets. We perform extensive experimental evaluation of our algorithms, using both real-world and synthetic data, and compare their performance against the solution proposed in [VLDB 14]. The results show that our algorithms are significantly faster and scalable to much larger sets than the greedy algorithm of Chester et al. for comparable quality answers.
Keywords
  • regret minimizing sets
  • skyline
  • top-k query
  • coreset
  • hitting set

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. P. K. Agarwal, S. Har-Peled, and K. R. Varadarajan. Approximating extent measures of points. Journal of the ACM (JACM), 51(4):606-635, 2004. Google Scholar
  2. P. K. Agarwal, S. Har-Peled, and K. R. Varadarajan. Geometric approximation via coresets. Combinatorial and computational geometry, 52:1-30, 2005. Google Scholar
  3. P. K. Agarwal and M. Sharir. Arrangements and their applications. Handbook of computational geometry, pages 49-119, 2000. Google Scholar
  4. A. Asudeh, A. Nazi, N. Zhang, and G. Das. Efficient computation of regret-ratio minimizing set: A compact maxima representative. In Proc. SIGMOD, 2017. To appear. Google Scholar
  5. I. Bartolini, P. Ciaccia, and M. Patella. Efficient sort-based skyline evaluation. ACM Transactions on Database Systems (TODS), 33(4):31, 2008. Google Scholar
  6. S. Börzsönyi, D. Kossmann, and K. Stocker. The skyline operator. In Proc. 17th Int. Conf. Data Eng., pages 421-430, 2001. Google Scholar
  7. H. Brönnimann and M. T. Goodrich. Almost optimal set covers in finite vc-dimension. Discrete & Computational Geometry, 14(4):463-479, 1995. Google Scholar
  8. W. Cao, J. Li, H. Wang, K. Wang, R. Wang, R. Chi-Wing Wong, and W. Zhan. k-regret minimizing set: Efficient algorithms and hardness. In ICDT 2017-20th International Conference on Database Theory, pages 11:1-11:19, 2017. Google Scholar
  9. I. Catallo, E. Ciceri, P. Fraternali, D. Martinenghi, and M. Tagliasacchi. Top-k diversity queries over bounded regions. ACM Transactions on Database Systems (TODS), 38(2):10, 2013. Google Scholar
  10. T. M. Chan. Faster core-set constructions and data stream algorithms in fixed dimensions. In Proceedings of the twentieth annual symposium on Computational geometry, pages 152-159. ACM, 2004. Google Scholar
  11. D. Chen, G.-Z. Sun, and N. Z. Gong. Efficient approximate top-k query algorithm using cube index. In Asia-Pacific Web Conference, pages 155-167. Springer, 2011. Google Scholar
  12. S. Chester, A. Thomo, S. Venkatesh, and S. Whitesides. Computing k-regret minimizing sets. Proceedings of the VLDB Endowment, 7(5):389-400, 2014. Google Scholar
  13. G. Das and M. T. Goodrich. On the complexity of optimization problems for 3-dimensional convex polyhedra and decision trees. Computational Geometry, 8(3):123-137, 1997. Google Scholar
  14. G. Das, D. Gunopulos, N. Koudas, and N. Sarkas. Ad-hoc top-k query answering for data streams. In Proceedings of the 33rd International Conference on Very Large Data Bases, VLDB'07, pages 183-194, 2007. Google Scholar
  15. Z. Gong, G.-Z. Sun, J. Yuan, and Y. Zhong. Efficient top-k query algorithms using k-skyband partition. In International Conference on Scalable Information Systems, pages 288-305. Springer, 2009. Google Scholar
  16. U. Güntzer, W. Balke, and W. Kiessling. Optimizing multi-feature queries for image databases. In Proceedings of the 26th International Conference on Very Large Data Bases, VLDB'00, pages 419-428, 2000. Google Scholar
  17. Inc. Gurobi Optimization. Gurobi optimizer reference manual, 2015. URL: http://www.gurobi.com.
  18. J.-S. Heo, J. Cho, and K.-Y. Whang. The hybrid-layer index: A synergic approach to answering top-k queries in arbitrary subspaces. In 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), pages 445-448. IEEE, 2010. Google Scholar
  19. V. Hristidis, N. Koudas, and Y. Papakonstantinou. Prefer: A system for the efficient execution of multi-parametric ranked queries. SIGMOD Rec., 30(2):259-270, May 2001. Google Scholar
  20. I. F. Ilyas, G. Beskales, and M. A. Soliman. A survey of top-k query processing techniques in relational database systems. ACM Computing Surveys (CSUR), 40(4):11, 2008. Google Scholar
  21. S. Jasna and M. J. Pillai. An algorithm for retrieving skyline points based on user specified constraints using the skyline ordering. International Journal of Computer Applications, 104(11), 2014. Google Scholar
  22. T. Kessler Faulkner, W. Brackenbury, and A. Lall. k-regret queries with nonlinear utilities. Proceedings of the VLDB Endowment, 8(13):2098-2109, 2015. Google Scholar
  23. V. Koltun and C. H. Papadimitriou. Approximately dominating representatives. Theor. Comput. Sci., 371(3):148-154, February 2007. Google Scholar
  24. R. D. Kulkarni and B. F. Momin. Skyline computation for frequent queries in update intensive environment. Journal of King Saud University-Computer and Information Sciences, 2015. Google Scholar
  25. R. D. Kulkarni and B. F. Momin. Parallel skyline computation for frequent queries in distributed environment. In Computational Techniques in Information and Communication Technologies (ICCTICT), 2016 International Conference on, pages 374-380. IEEE, 2016. Google Scholar
  26. C. Li, Kevin K. C.-C. Chang, and I. F. Ilyas. Supporting ad-hoc ranking aggregates. In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, SIGMOD'06, pages 61-72, 2006. Google Scholar
  27. Q. Liu, Y. Gao, G. Chen, Q. Li, and T. Jiang. On efficient reverse k-skyband query processing. In International Conference on Database Systems for Advanced Applications, pages 544-559. Springer, 2012. Google Scholar
  28. A. Marian, N. Bruno, and L. Gravano. Evaluating top-k queries over web-accessible databases. ACM Trans. Database Syst., 29(2):319-362, June 2004. Google Scholar
  29. M. Morse, J. M. Patel, and W. I. Grosky. Efficient continuous skyline computation. Information Sciences, 177(17):3411-3437, 2007. Google Scholar
  30. D. Nanongkai, A. Lall, A. Das Sarma, and K. Makino. Interactive regret minimization. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pages 109-120. ACM, 2012. Google Scholar
  31. D. Nanongkai, A. D. Sarma, A. Lall, R. J. Lipton, and J. Xu. Regret-minimizing representative databases. Proceedings of the VLDB Endowment, 3(1-2):1114-1124, 2010. Google Scholar
  32. C. H. Papadimitriou and M. Yannakakis. On the approximability of trade-offs and optimal access of web sources. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science, FOCS'00, pages 86-92, 2000. Google Scholar
  33. C. H. Papadimitriou and M. Yannakakis. Multiobjective query optimization. In Proceedings of the Twentieth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS'01, pages 52-59, 2001. Google Scholar
  34. P. Peng and R. C.-W. Wong. Geometry approach for k-regret query. In 2014 IEEE 30th International Conference on Data Engineering, pages 772-783. IEEE, 2014. Google Scholar
  35. S. Rahul and Y. Tao. Efficient top-k indexing via general reductions. In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS'16, pages 277-288, 2016. Google Scholar
  36. M. Theobald, G. Weikum, and R. Schenkel. Top-k query evaluation with probabilistic guarantees. In Proceedings of the 300th International Conference on Very Large Data Bases, VLDB'04, pages 648-659, 2004. Google Scholar
  37. E. Tiakas, G. Valkanas, A. N. Papadopoulos, Y. Manolopoulos, and D. Gunopulos. Processing top-k dominating queries in metric spaces. ACM Trans. Database Syst., 40(4):23:1-23:38, January 2016. Google Scholar
  38. S. Vassilvitskii and M. Yannakakis. Efficiently computing succinct trade-off curves. Theor. Comput. Sci., 348(2):334-356, December 2005. Google Scholar
  39. A. Vlachou, C. Doulkeridis, Y. Kotidis, and K. Nørvåg. Reverse top-k queries. In 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), pages 365-376. IEEE, 2010. Google Scholar
  40. D. Xin, C. Chen, and J. Han. Towards robust indexing for ranked queries. In Proceedings of the 32Nd International Conference on Very Large Data Bases, VLDB'06, pages 235-246, 2006. Google Scholar
  41. A. Yu, P. K. Agarwal, and J. Yang. Processing a large number of continuous preference top-k queries. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pages 397-408. ACM, 2012. Google Scholar
  42. A. Yu, P. K. Agarwal, and J. Yang. Top-k preferences in high dimensions. IEEE Trans. Knowl. Data Eng., 28(2):311-325, 2016. Google Scholar
  43. Z. Zhang, S. w. Hwang, K. C.-C. Chang, M. Wang, C. A. Lang, and Y. c. Chang. Boolean + ranking: Querying a database by k-constrained optimization. In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, SIGMOD'06, pages 359-370, 2006. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail