Optimal Oblivious Algorithms for Multi-Way Joins

Authors: Xiao Hu, Zhiang Wu

Author Details

Xiao Hu
  • University of Waterloo, Canada
Zhiang Wu
  • University of Waterloo, Canada

Cite As

Xiao Hu and Zhiang Wu. Optimal Oblivious Algorithms for Multi-Way Joins. In 28th International Conference on Database Theory (ICDT 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 328, pp. 25:1-25:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025) https://doi.org/10.4230/LIPIcs.ICDT.2025.25

Abstract

In cloud databases, computation over sensitive data uploaded by clients inevitably raises concerns about data security and privacy. Even when cryptographic primitives and trusted computing environments are integrated into query processing to safeguard the actual contents of the data, the access patterns of algorithms can still leak private information about the data. Oblivious RAM (ORAM) and circuits are two generic approaches to this issue: both ensure that an algorithm's access patterns remain oblivious to the data. However, deploying these methods on insecure algorithms, particularly for multi-way join processing, is computationally expensive and inherently challenging.
In this paper, we propose a novel sorting-based algorithm for multi-way join processing that operates without relying on ORAM simulations or other security assumptions. Our algorithm is a non-trivial, provably oblivious composition of basic primitives, with time complexity matching the insecure worst-case optimal join algorithm, up to a logarithmic factor. Furthermore, it is cache-agnostic, with cache complexity matching the insecure lower bound, also up to a logarithmic factor. This clean and straightforward approach has the potential to be extended to other security settings and implemented in practical database systems.
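To make the idea of a data-independent access pattern concrete, the following is a minimal Python sketch of a classic oblivious primitive, a bitonic sorting network. It is not the algorithm proposed in the paper. The function name bitonic_sort and the recorded trace are hypothetical and exist only for illustration. The point it shows is that the sequence of compared positions depends only on the input length, never on the values.

```python
from typing import List, Tuple


def bitonic_sort(values: List[int]) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Sort a list whose length is a power of two.

    Returns the sorted list and the trace of (i, partner) index pairs
    that were compared, in order. The trace is an illustrative stand-in
    for the memory access pattern an observer could see.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    data = list(values)
    trace: List[Tuple[int, int]] = []

    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance between compared positions
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    # Direction is fixed by the index, not by the data.
                    ascending = (i & k) == 0
                    trace.append((i, partner))
                    lo, hi = data[i], data[partner]
                    swap = (lo > hi) == ascending
                    # Both cells are written back whether or not they swap.
                    data[i], data[partner] = (hi, lo) if swap else (lo, hi)
            j //= 2
        k *= 2
    return data, trace


if __name__ == "__main__":
    sorted_a, trace_a = bitonic_sort([5, 3, 8, 1, 9, 2, 7, 4])
    sorted_b, trace_b = bitonic_sort([1, 1, 1, 1, 0, 0, 0, 0])
    print(sorted_a)
    print(trace_a == trace_b)
```

Running it on two different inputs of the same length yields the same trace, which is the property an oblivious algorithm needs at the level of memory positions. This sketch only models which positions are touched. A real implementation would also need a constant-time compare-and-swap, which plain Python does not guarantee.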

Subject Classification

ACM Subject Classification
  • Security and privacy → Management and querying of encrypted data
  • Information systems → Join algorithms
Keywords
  • oblivious algorithms
  • multi-way joins
  • worst-case optimality
