Locality Sensitive Hashing for Set-Queries, Motivated by Group Recommendations

Authors Haim Kaplan, Jay Tenenbaum

Thumbnail PDF


  • Filesize: 0.81 MB
  • 18 pages

Document Identifiers

Author Details

Haim Kaplan
  • School of Computer Science, Tel Aviv University, Israel
Jay Tenenbaum
  • School of Computer Science, Tel Aviv University, Israel


We want to thank Prof. Micha Sharir and Prof. Edith Cohen for the fruitful discussions.

Cite AsGet BibTex

Haim Kaplan and Jay Tenenbaum. Locality Sensitive Hashing for Set-Queries, Motivated by Group Recommendations. In 17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 162, pp. 28:1-28:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Locality Sensitive Hashing (LSH) is an effective method to index a set of points such that we can efficiently find the nearest neighbors of a query point. We extend this method to our novel Set-query LSH (SLSH), such that it can find the nearest neighbors of a set of points, given as a query. Let s(x,y) be the similarity between two points x and y. We define a similarity between a set Q and a point x by aggregating the similarities s(p,x) for all p∈ Q. For example, we can take s(p,x) to be the angular similarity between p and x (i.e., 1-(∠(x,p)/π)), and aggregate by arithmetic or geometric averaging, or taking the lowest similarity. We develop locality sensitive hash families and data structures for a large set of such arithmetic and geometric averaging similarities, and analyze their collision probabilities. We also establish an analogous framework and hash families for distance functions. Specifically, we give a structure for the euclidean distance aggregated by either averaging or taking the maximum. We leverage SLSH to solve a geometric extension of the approximate near neighbors problem. In this version, we consider a metric for which the unit ball is an ellipsoid and its orientation is specified with the query. An important application that motivates our work is group recommendation systems. Such a system embeds movies and users in the same feature space, and the task of recommending a movie for a group to watch together, translates to a set-query Q using an appropriate similarity.

Subject Classification

ACM Subject Classification
  • Theory of computation → Computational geometry
  • Theory of computation → Data structures design and analysis
  • Information systems → Information retrieval
  • Locality sensitive hashing
  • nearest neighbors
  • similarity search
  • group recommendations
  • distance functions
  • similarity functions
  • ellipsoid


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. Sihem Amer-Yahia, Senjuti Basu Roy, Ashish Chawlat, Gautam Das, and Cong Yu. Group recommendation: Semantics and efficiency. Proceedings of the VLDB Endowment, 2(1):754-765, 2009. URL: https://doi.org/10.14778/1687627.1687713.
  2. Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In FOCS, pages 459-468. IEEE, 2006. URL: https://doi.org/10.1109/FOCS.2006.49.
  3. Alexandr Andoni and Ilya Razenshteyn. Optimal data-dependent hashing for approximate near neighbors. In STOC, pages 793-801. ACM, 2015. URL: https://doi.org/10.1145/2746539.2746553.
  4. Alexandr Andoni and Ilya Razenshteyn. Tight lower bounds for data-dependent locality-sensitive hashing. In SOCG, pages 1-11. ACM, 2016. URL: https://doi.org/10.4230/LIPIcs.SoCG.2016.9.
  5. Liliana Ardissono, Anna Goy, Giovanna Petrone, Marino Segnan, and Pietro Torasso. Tailoring the recommendation of tourist information to heterogeneous user groups. In Workshop on adaptive hypermedia, pages 280-295. Springer, 2001. URL: https://doi.org/10.1007/3-540-45844-1_26.
  6. Yoram Bachrach, Yehuda Finkelstein, Ran Gilad-Bachrach, Liran Katzir, Noam Koenigstein, Nir Nice, and Ulrich Paquet. Speeding up the xbox recommender system using a euclidean transformation for inner-product spaces. In Proceedings of the 8th ACM Conference on Recommender systems, pages 257-264. ACM, 2014. URL: https://doi.org/10.1145/2645710.2645741.
  7. Andrei Z Broder. On the resemblance and containment of documents. In Compression and complexity of sequences, pages 21-29. IEEE, 1997. URL: https://doi.org/10.1109/SEQUEN.1997.666900.
  8. Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In SOCG, pages 253-262. ACM, 2004. URL: https://doi.org/10.1145/997817.997857.
  9. Aristides Gionis, Piotr Indyk, Rajeev Motwani, et al. Similarity search in high dimensions via hashing. In VLDB, pages 518-529, 1999. URL: http://www.vldb.org/conf/1999/P49.pdf.
  10. Sariel Har-Peled, Piotr Indyk, and Rajeev Motwani. Approximate nearest neighbor: Towards removing the curse of dimensionality. Theory of computing, 8(1):321-350, 2012. URL: https://doi.org/10.4086/toc.2012.v008a014.
  11. Piotr Indyk and Rajeev Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. In STOC, pages 604-613. ACM, 1998. URL: https://doi.org/10.1145/276698.276876.
  12. Prateek Jain, Sudheendra Vijayanarasimhan, and Kristen Grauman. Hashing hyperplane queries to near points with applications to large-scale active learning. Transactions on Pattern Analysis and Machine Intelligence, pages 276-288, 2014. URL: https://doi.org/10.1109/TPAMI.2013.121.
  13. Anthony Jameson. More than the sum of its members: challenges for group recommender systems. In AVI, pages 48-54. ACM, 2004. URL: https://doi.org/10.1145/989863.989869.
  14. Anthony Jameson and Barry Smyth. Recommendation to groups. In The adaptive web, pages 596-627. Springer, 2007. URL: https://doi.org/10.1007/978-3-540-72079-9_20.
  15. Yehuda Koren, Robert Bell, and Chris Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8):30-37, 2009. URL: https://doi.org/10.1109/MC.2009.263.
  16. Qin Lv, William Josephson, Zhe Wang, Moses Charikar, and Kai Li. Multi-probe lsh: efficient indexing for high-dimensional similarity search. In VLDB, pages 950-961, 2007. URL: http://www.vldb.org/conf/2007/papers/research/p950-lv.pdf.
  17. Judith Masthoff. Group modeling: Selecting a sequence of television items to suit a group of viewers. In Personalized digital television, pages 93-141. Springer, 2004. URL: https://doi.org/10.1023/B:USER.0000010138.79319.fd.
  18. Kevin McCarthy, Lorraine McGinty, Barry Smyth, and Maria Salamó. The needs of the many: a case-based group recommender system. In ECCBR, pages 196-210. Springer, 2006. URL: https://doi.org/10.1007/11805816_16.
  19. Rajeev Motwani, Assaf Naor, and Rina Panigrahi. Lower bounds on locality sensitive hashing. In SOCG, pages 154-157. ACM, 2006. URL: https://doi.org/10.1145/1137856.1137881.
  20. Behnam Neyshabur and Nathan Srebro. On symmetric and asymmetric lshs for inner product search. In ICML, pages 1926-1934, 2015. URL: http://proceedings.mlr.press/v37/neyshabur15.html.
  21. Mark O’connor, Dan Cosley, Joseph A Konstan, and John Riedl. Polylens: a recommender system for groups of users. In ECSCW, pages 199-218. Springer, 2001. URL: https://doi.org/10.1007/0-306-48019-0_11.
  22. Anshumali Shrivastava and Ping Li. Asymmetric lsh (alsh) for sublinear time maximum inner product search (mips). In NIPS, pages 2321-2329, 2014. URL: http://arxiv.org/abs/1405.5869.
  23. Zhiwen Yu, Xingshe Zhou, Yanbin Hao, and Jianhua Gu. Tv program recommendation for multiple viewers based on user profile merging. UMUAI, 16(1):63-82, 2006. URL: https://doi.org/10.1007/s11257-006-9005-6.
Questions / Remarks / Feedback

Feedback for Dagstuhl Publishing

Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail