The Communication Complexity of Set Intersection Under Product Distributions

Authors Rotem Oshman, Tal Roth



PDF
Thumbnail PDF

File

LIPIcs.ICALP.2023.95.pdf
  • Filesize: 0.74 MB
  • 20 pages

Document Identifiers

Author Details

Rotem Oshman
  • Tel-Aviv University, Israel
Tal Roth
  • Tel-Aviv University, Israel

Cite As Get BibTex

Rotem Oshman and Tal Roth. The Communication Complexity of Set Intersection Under Product Distributions. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 95:1-95:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023) https://doi.org/10.4230/LIPIcs.ICALP.2023.95

Abstract

We consider a multiparty setting where k parties have private inputs X_1,…,X_k ⊆ [n] and wish to compute the intersection ⋂_{𝓁 =1}^k X_𝓁 of their sets, using as little communication as possible. This task generalizes the well-known problem of set disjointness, where the parties are required only to determine whether the intersection is empty or not. In the worst-case, it is known that the communication complexity of finding the intersection is the same as that of solving set disjointness, regardless of the size of the intersection: the cost of both problems is Ω(n log k + k) bits in the shared blackboard model, and Ω (nk) bits in the coordinator model.
In this work we consider a realistic setting where the parties' inputs are independent of one another, that is, the input is drawn from a product distribution. We show that this makes finding the intersection significantly easier than in the worst-case: only Θ̃((n^{1-1/k} (H(S) + 1)^{1/k}) + k) bits of communication are required, where {H}(S) is the Shannon entropy of the intersection S. We also show that the parties do not need to know the exact underlying input distribution; if we are given in advance O(n^{1/k}) samples from the underlying distribution μ, we can learn enough about μ to allow us to compute the intersection of an input drawn from μ using expected communication Θ̃((n^{1-1/k}𝔼[|S|]^{1/k}) + k), where |S| is the size of the intersection.

Subject Classification

ACM Subject Classification
  • Theory of computation → Communication complexity
Keywords
  • Communication complexity
  • intersection
  • set disjointness

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. Journal of Computer and System Sciences, 58(1):137-147, 1999. Google Scholar
  2. László Babai, Peter Frankl, and Janos Simon. Complexity classes in communication complexity theory. In 27th Annual Symposium on Foundations of Computer Science (sfcs 1986), pages 337-347, 1986. Google Scholar
  3. Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, and D. Sivakumar. An information statistics approach to data stream and communication complexity. In 43rd Symposium on Foundations of Computer Science (FOCS 2002), pages 209-218, 2002. Google Scholar
  4. Anup Bhattacharya, Sourav Chakraborty, Arijit Ghosh, Gopinath Mishra, and Manaswi Paraashar. Disjointness through the lens of vapnik-chervonenkis dimension: Sparsity and beyond. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2020, volume 176 of LIPIcs, pages 23:1-23:15, 2020. Google Scholar
  5. Ralph Bottesch, Dmitry Gavinsky, and Hartmut Klauck. Correlation in hard distributions in communication complexity. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM, volume 40 of LIPIcs, pages 544-572, 2015. Google Scholar
  6. Mark Braverman, Faith Ellen, Rotem Oshman, Toniann Pitassi, and Vinod Vaikuntanathan. A tight bound for set disjointness in the message-passing model. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, pages 668-677, 2013. Google Scholar
  7. Mark Braverman, Ankit Garg, Denis Pankratov, and Omri Weinstein. From information to exact communication. In STOC, pages 151-160. ACM, 2013. Google Scholar
  8. Mark Braverman and Rotem Oshman. On information complexity in the broadcast model. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, PODC 2015, pages 355-364, 2015. Google Scholar
  9. Mark Braverman and Rotem Oshman. A rounds vs. communication tradeoff for multi-party set disjointness. In 58th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2017, pages 144-155, 2017. Google Scholar
  10. Joshua Brody, Amit Chakrabarti, Ranganath Kondapally, David P. Woodruff, and Grigory Yaroslavtsev. Beyond set disjointness: The communication complexity of finding the intersection. In Proceedings of the 2014 ACM Symposium on Principles of Distributed Computing, PODC '14, pages 106-113, 2014. Google Scholar
  11. Arkadev Chattopadhyay and Toniann Pitassi. The story of set disjointness. ACM SIGACT News, 41(3):59-85, 2010. Google Scholar
  12. Nachum Dershowitz, Rotem Oshman, and Tal Roth. The communication complexity of multiparty set disjointness under product distributions. In STOC, pages 1194-1207. ACM, 2021. Google Scholar
  13. Dmitry Gavinsky. The communication complexity of the inevitable intersection problem. Chic. J. Theor. Comput. Sci., 2020, 2020. Google Scholar
  14. Badih Ghazi, Ben Kreuter, Ravi Kumar, Pasin Manurangsi, Jiayu Peng, Evgeny Skvortsov, Yao Wang, and Craig Wright. Multiparty reach and frequency histogram: Private, secure, and practical. Proc. Priv. Enhancing Technol., 2022(1):373-395, 2022. URL: https://doi.org/10.2478/popets-2022-0019.
  15. André Gronemeier. Asymptotically optimal lower bounds on the nih-multi-party information complexity of the and-function and disjointness. In 26th International Symposium on Theoretical Aspects of Computer Science, STACS 2009, volume 3 of LIPIcs, pages 505-516, 2009. Google Scholar
  16. Dirk Van Gucht, Ryan Williams, David P. Woodruff, and Qin Zhang. The communication complexity of distributed set-joins with applications to matrix multiplication. In Proceedings of the 34th ACM Symposium on Principles of Database Systems, PODS, pages 199-212, 2015. URL: https://doi.org/10.1145/2745754.2745779.
  17. Dawei Huang, Seth Pettie, Yixiang Zhang, and Zhijun Zhang. The communication complexity of set intersection and multiple equality testing. SIAM J. Comput., 50(2):674-717, 2021. Google Scholar
  18. Mihaela Ion, Ben Kreuter, Ahmet Erhan Nergiz, Sarvar Patel, Shobhit Saxena, Karn Seth, Mariana Raykova, David Shanahan, and Moti Yung. On deploying secure computing: Private intersection-sum-with-cardinality. In IEEE European Symposium on Security and Privacy, EuroS&P, pages 370-389. IEEE, 2020. URL: https://doi.org/10.1109/EuroSP48549.2020.00031.
  19. Ivo Kubjas and Vitaly Skachek. Two-party function computation on the reconciled data. In 55th Annual Allerton Conference on Communication, Control, and Computing, pages 390-396. IEEE, 2017. URL: https://doi.org/10.1109/ALLERTON.2017.8262764.
  20. Nan Ma and Prakash Ishwar. Two-terminal distributed source coding with alternating messages for function computation. In 2008 IEEE International Symposium on Information Theory, pages 51-55. IEEE, 2008. Google Scholar
  21. Nan Ma and Prakash Ishwar. Infinite-message distributed source coding for two-terminal interactive computing. In 2009 47th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 1510-1517. IEEE, 2009. Google Scholar
  22. Nan Ma and Prakash Ishwar. Some results on distributed source coding for interactive function computation. IEEE Transactions on Information Theory, 57(9):6180-6195, 2011. Google Scholar
  23. Jeff M. Phillips, Elad Verbin, and Qin Zhang. Lower bounds for number-in-hand multiparty communication complexity, made easy. SIAM J. Comput., 45(1):174-196, 2016. Google Scholar
  24. Alexander A Razborov. On the distributional complexity of disjointness. In International Colloquium on Automata, Languages, and Programming, pages 249-253, 1990. Google Scholar
  25. Mert Saglam and Gábor Tardos. On the communication complexity of sparse set disjointness and exists-equal problems. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, pages 678-687, 2013. Google Scholar
  26. Georg Schnitger and Bala Kalyanasundaram. The probabilistic communication complexity of set intersection. In Proceedings of the Second Annual Conference on Structure in Complexity Theory 1987, 1987. Google Scholar
  27. Alexander A Sherstov. Communication complexity theory: Thirty-five years of set disjointness. In International Symposium on Mathematical Foundations of Computer Science, pages 24-43, 2014. Google Scholar
  28. Gregory Valiant and Paul Valiant. A CLT and tight lower bounds for estimating entropy. Electron. Colloquium Comput. Complex., TR10-183, 2010. URL: https://eccc.weizmann.ac.il/report/2010/183.
  29. Thomas Watson. Communication complexity with small advantage. Comput. Complex., 29(1):2, 2020. Google Scholar
  30. David P. Woodruff and Qin Zhang. When distributed computation is communication expensive. In Distributed Computing: 27th International Symposium, DISC 2013, pages 16-30, 2013. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail