Approximating Pandora’s Box with Correlations

Authors Shuchi Chawla , Evangelia Gergatsouli , Jeremy McMahan , Christos Tzamos

Thumbnail PDF


  • Filesize: 0.98 MB
  • 24 pages

Document Identifiers

Author Details

Shuchi Chawla
  • University of Texas - Austin, TX, USA
Evangelia Gergatsouli
  • University of Wisconsin - Madison, WI, USA
Jeremy McMahan
  • University of Wisconsin - Madison, WI, USA
Christos Tzamos
  • University of Wisconsin - Madison, WI, USA
  • University of Athens, Greece

Cite AsGet BibTex

Shuchi Chawla, Evangelia Gergatsouli, Jeremy McMahan, and Christos Tzamos. Approximating Pandora’s Box with Correlations. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 275, pp. 26:1-26:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


We revisit the classic Pandora’s Box (PB) problem under correlated distributions on the box values. Recent work of [Shuchi Chawla et al., 2020] obtained constant approximate algorithms for a restricted class of policies for the problem that visit boxes in a fixed order. In this work, we study the complexity of approximating the optimal policy which may adaptively choose which box to visit next based on the values seen so far. Our main result establishes an approximation-preserving equivalence of PB to the well studied Uniform Decision Tree (UDT) problem from stochastic optimization and a variant of the Min-Sum Set Cover (MSSC_f) problem. For distributions of support m, UDT admits a log m approximation, and while a constant factor approximation in polynomial time is a long-standing open problem, constant factor approximations are achievable in subexponential time [Ray Li et al., 2020]. Our main result implies that the same properties hold for PB and MSSC_f. We also study the case where the distribution over values is given more succinctly as a mixture of m product distributions. This problem is again related to a noisy variant of the Optimal Decision Tree which is significantly more challenging. We give a constant-factor approximation that runs in time n^Õ(m²/ε²) when the mixture components on every box are either identical or separated in TV distance by ε.

Subject Classification

ACM Subject Classification
  • Theory of computation → Design and analysis of algorithms
  • Pandora’s Box
  • Min Sum Set Cover
  • stochastic optimization
  • approximation preserving reduction


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. Marek Adamczyk, Maxim Sviridenko, and Justin Ward. Submodular stochastic probing on matroids. Math. Oper. Res., 41(3):1022-1038, 2016. URL:
  2. Micah Adler and Brent Heeringa. Approximating optimal binary decision trees. Algorithmica, 62(3-4):1112-1121, 2012. URL:
  3. Yossi Azar, Iftah Gamzu, and Xiaoxin Yin. Multiple intents re-ranking. In Michael Mitzenmacher, editor, Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 669-678. ACM, 2009. URL:
  4. Nikhil Bansal, Anupam Gupta, and Ravishankar Krishnaswamy. A constant factor approximation algorithm for generalized min-sum set cover. In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2010, Austin, Texas, USA, January 17-19, 2010, pages 1539-1545, 2010. URL:
  5. Curtis Bechtel, Shaddin Dughmi, and Neel Patel. Delegated pandora’s box. In David M. Pennock, Ilya Segal, and Sven Seuken, editors, EC '22: The 23rd ACM Conference on Economics and Computation, Boulder, CO, USA, July 11 - 15, 2022, pages 666-693. ACM, 2022. URL:
  6. Ben Berger, Tomer Ezra, Michal Feldman, and Federico Fusco. Pandora’s problem with combinatorial cost. CoRR, abs/2303.01078, 2023. URL:
  7. Hedyeh Beyhaghi and Linda Cai. Pandora’s problem with nonobligatory inspection: Optimal structure and a PTAS. CoRR, abs/2212.01524, 2022. URL:
  8. Hedyeh Beyhaghi and Robert Kleinberg. Pandora’s problem with nonobligatory inspection. In Anna Karlin, Nicole Immorlica, and Ramesh Johari, editors, Proceedings of the 2019 ACM Conference on Economics and Computation, EC 2019, Phoenix, AZ, USA, June 24-28, 2019, pages 131-132. ACM, 2019. URL:
  9. Shant Boodaghians, Federico Fusco, Philip Lazos, and Stefano Leonardi. Pandora’s box problem with order constraints. In Péter Biró, Jason D. Hartline, Michael Ostrovsky, and Ariel D. Procaccia, editors, EC '20: The 21st ACM Conference on Economics and Computation, Virtual Event, Hungary, July 13-17, 2020, pages 439-458. ACM, 2020. URL:
  10. Venkatesan T. Chakaravarthy, Vinayaka Pandit, Sambuddha Roy, Pranjal Awasthi, and Mukesh K. Mohania. Decision trees for entity identification: Approximation algorithms and hardness results. ACM Trans. Algorithms, 7(2):15:1-15:22, 2011. URL:
  11. Venkatesan T. Chakaravarthy, Vinayaka Pandit, Sambuddha Roy, and Yogish Sabharwal. Approximating decision trees with multiway branches. In Susanne Albers, Alberto Marchetti-Spaccamela, Yossi Matias, Sotiris E. Nikoletseas, and Wolfgang Thomas, editors, Automata, Languages and Programming, 36th International Colloquium, ICALP 2009, Rhodes, Greece, July 5-12, 2009, Proceedings, Part I, volume 5555 of Lecture Notes in Computer Science, pages 210-221. Springer, 2009. URL:
  12. Moses Charikar, Ronald Fagin, Venkatesan Guruswami, Jon M. Kleinberg, Prabhakar Raghavan, and Amit Sahai. Query strategies for priced information (extended abstract). In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, May 21-23, 2000, Portland, OR, USA, pages 582-591, 2000. URL:
  13. Shuchi Chawla, Evangelia Gergatsouli, Yifeng Teng, Christos Tzamos, and Ruimin Zhang. Pandora’s box with correlations: Learning and approximation. In 61st IEEE Annual Symposium on Foundations of Computer Science, FOCS 2020, Durham, NC, USA, November 16-19, 2020, pages 1214-1225. IEEE, 2020. URL:
  14. Yuxin Chen, S. Hamed Hassani, Amin Karbasi, and Andreas Krause. Sequential information maximization: When is greedy near-optimal? In Proceedings of The 28th Conference on Learning Theory, COLT 2015, Paris, France, July 3-6, 2015, pages 338-363, 2015. URL:
  15. Yuxin Chen, Shervin Javdani, Amin Karbasi, J. Andrew Bagnell, Siddhartha S. Srinivasa, and Andreas Krause. Submodular surrogates for value of information. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, January 25-30, 2015, Austin, Texas, USA., pages 3511-3518, 2015. URL:
  16. Ferdinando Cicalese, Tobias Jacobs, Eduardo Sany Laber, and Marco Molinaro. On greedy algorithms for decision trees. In Otfried Cheong, Kyung-Yong Chwa, and Kunsoo Park, editors, Algorithms and Computation - 21st International Symposium, ISAAC 2010, Jeju Island, Korea, December 15-17, 2010, Proceedings, Part II, volume 6507 of Lecture Notes in Computer Science, pages 206-217. Springer, 2010. URL:
  17. Sanjoy Dasgupta. Analysis of a greedy active learning strategy. In Advances in Neural Information Processing Systems 17 [Neural Information Processing Systems, NIPS 2004, December 13-18, 2004, Vancouver, British Columbia, Canada], pages 337-344, 2004. URL:
  18. Bolin Ding, Yiding Feng, Chien-Ju Ho, Wei Tang, and Haifeng Xu. Competitive information design for pandora’s box. In Nikhil Bansal and Viswanath Nagarajan, editors, Proceedings of the 2023 ACM-SIAM Symposium on Discrete Algorithms, SODA 2023, Florence, Italy, January 22-25, 2023, pages 353-381. SIAM, 2023. URL:
  19. Laura Doval. Whether or not to open pandora’s box. J. Econ. Theory, 175:127-158, 2018. URL:
  20. Hossein Esfandiari, Mohammad Taghi Hajiaghayi, Brendan Lucier, and Michael Mitzenmacher. Online pandora’s boxes and bandits. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019, pages 1885-1892. AAAI Press, 2019. URL:
  21. Uriel Feige, László Lovász, and Prasad Tetali. Approximating min sum set cover. Algorithmica, 40(4):219-234, 2004. URL:
  22. Hu Fu, Jiawei Li, and Daogao Liu. Pandora box problem with nonobligatory inspection: Hardness and improved approximation algorithms. CoRR, abs/2207.09545, 2022. URL:
  23. M. R. Garey and Ronald L. Graham. Performance bounds on the splitting algorithm for binary testing. Acta Informatica, 3:347-355, 1974. URL:
  24. Evangelia Gergatsouli and Christos Tzamos. Weitzman’s rule for pandora’s box with correlations. CoRR, abs/2301.13534, 2023. URL:
  25. J.C. Gittins and D.M. Jones. A dynamic allocation index for the sequential design of experiments. Progress in Statistics, pages 241-266, 1974. Google Scholar
  26. Ashish Goel, Sudipto Guha, and Kamesh Munagala. Asking the right questions: model-driven optimization using probes. In Proceedings of the Twenty-Fifth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 26-28, 2006, Chicago, Illinois, USA, pages 203-212, 2006. URL:
  27. Daniel Golovin and Andreas Krause. Adaptive submodularity: Theory and applications in active learning and stochastic optimization. J. Artif. Intell. Res., 42:427-486, 2011. URL:
  28. Daniel Golovin and Andreas Krause. Adaptive submodularity: A new approach to active learning and stochastic optimization. CoRR, abs/1003.3967, 2017. URL:
  29. Daniel Golovin, Andreas Krause, and Debajyoti Ray. Near-optimal bayesian active learning with noisy observations. In John D. Lafferty, Christopher K. I. Williams, John Shawe-Taylor, Richard S. Zemel, and Aron Culotta, editors, Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, Vancouver, British Columbia, Canada, pages 766-774. Curran Associates, Inc., 2010. URL:
  30. Andrew Guillory and Jeff A. Bilmes. Average-case active learning with costs. In Ricard Gavaldà, Gábor Lugosi, Thomas Zeugmann, and Sandra Zilles, editors, Algorithmic Learning Theory, 20th International Conference, ALT 2009, Porto, Portugal, October 3-5, 2009. Proceedings, volume 5809 of Lecture Notes in Computer Science, pages 141-155. Springer, 2009. URL:
  31. Anupam Gupta, Haotian Jiang, Ziv Scully, and Sahil Singla. The markovian price of information. In Integer Programming and Combinatorial Optimization - 20th International Conference, IPCO 2019, Ann Arbor, MI, USA, May 22-24, 2019, Proceedings, pages 233-246, 2019. URL:
  32. Anupam Gupta and Amit Kumar. Sorting and selection with structured costs. In 42nd Annual Symposium on Foundations of Computer Science, FOCS 2001, 14-17 October 2001, Las Vegas, Nevada, USA, pages 416-425, 2001. URL:
  33. Anupam Gupta and Viswanath Nagarajan. A stochastic probing problem with applications. In Integer Programming and Combinatorial Optimization - 16th International Conference, IPCO 2013, Valparaíso, Chile, March 18-20, 2013. Proceedings, pages 205-216, 2013. URL:
  34. Anupam Gupta, Viswanath Nagarajan, and R. Ravi. Approximation algorithms for optimal decision trees and adaptive TSP problems. Math. Oper. Res., 42(3):876-896, 2017. URL:
  35. Anupam Gupta, Viswanath Nagarajan, and Sahil Singla. Algorithms and adaptivity gaps for stochastic probing. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 1731-1747, 2016. URL:
  36. Anupam Gupta, Viswanath Nagarajan, and Sahil Singla. Adaptivity gaps for stochastic probing: Submodular and XOS functions. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, Barcelona, Spain, Hotel Porta Fira, January 16-19, pages 1688-1702, 2017. URL:
  37. Noam Hazon, Yonatan Aumann, Sarit Kraus, and David Sarne. Physical search problems with probabilistic knowledge. Artif. Intell., 196:26-52, 2013. URL:
  38. Laurent Hyafil and Ronald L. Rivest. Constructing optimal binary decision trees is np-complete. Inf. Process. Lett., 5(1):15-17, 1976. URL:
  39. Sungjin Im, Viswanath Nagarajan, and Ruben van der Zwaan. Minimum latency submodular cover. ACM Trans. Algorithms, 13(1):13:1-13:28, 2016. URL:
  40. Su Jia, Viswanath Nagarajan, Fatemeh Navidi, and R. Ravi. Optimal decision tree with noisy outcomes. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d'Alché-Buc, Emily B. Fox, and Roman Garnett, editors, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, pages 3298-3308, 2019. URL:
  41. Prabhanjan Kambadur, Viswanath Nagarajan, and Fatemeh Navidi. Adaptive submodular ranking. In Integer Programming and Combinatorial Optimization - 19th International Conference, IPCO 2017, Waterloo, ON, Canada, June 26-28, 2017, Proceedings, pages 317-329, 2017. URL:
  42. S. Rao Kosaraju, Teresa M. Przytycka, and Ryan S. Borgstrom. On an optimal split tree problem. In Frank K. H. A. Dehne, Arvind Gupta, Jörg-Rüdiger Sack, and Roberto Tamassia, editors, Algorithms and Data Structures, 6th International Workshop, WADS '99, Vancouver, British Columbia, Canada, August 11-14, 1999, Proceedings, volume 1663 of Lecture Notes in Computer Science, pages 157-168. Springer, 1999. URL:
  43. Ray Li, Percy Liang, and Stephen Mussmann. A tight analysis of greedy yields subexponential time approximation for uniform decision tree. In Shuchi Chawla, editor, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, Salt Lake City, UT, USA, January 5-8, 2020, pages 102-121. SIAM, 2020. URL:
  44. Zhen Liu, Srinivasan Parthasarathy, Anand Ranganathan, and Hao Yang. Near-optimal algorithms for shared filter evaluation in data stream systems. In Jason Tsong-Li Wang, editor, Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10-12, 2008, pages 133-146. ACM, 2008. URL:
  45. Donald W. Loveland. Performance bounds for binary testing with arbitrary weights. Acta Informatica, 22(1):101-114, 1985. URL:
  46. Kamesh Munagala, Shivnath Babu, Rajeev Motwani, and Jennifer Widom. The pipelined set cover problem. In Thomas Eiter and Leonid Libkin, editors, Database Theory - ICDT 2005, 10th International Conference, Edinburgh, UK, January 5-7, 2005, Proceedings, volume 3363 of Lecture Notes in Computer Science, pages 83-98. Springer, 2005. URL:
  47. Feng Nan and Venkatesh Saligrama. Comments on the proof of adaptive stochastic set cover based on adaptive submodularity and its implications for the group identification problem in "group-based active query selection for rapid diagnosis in time-critical situations". IEEE Trans. Inf. Theory, 63(11):7612-7614, 2017. URL:
  48. Krishna R. Pattipati and Mahesh Dontamsetty. On a generalized test sequencing problem. IEEE Trans. Syst. Man Cybern., 22(2):392-396, 1992. URL:
  49. Vili Podgorelec, Peter Kokol, Bruno Stiglic, and Ivan Rozman. Decision trees: An overview and their use in medicine. Journal of medical systems, 26:445-63, November 2002. URL:
  50. Sahil Singla. The price of information in combinatorial optimization. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, New Orleans, LA, USA, January 7-10, 2018, pages 2523-2532, 2018. URL:
  51. Martin Skutella and David P. Williamson. A note on the generalized min-sum set cover problem. Oper. Res. Lett., 39(6):433-436, 2011. URL:
  52. Martin L Weitzman. Optimal Search for the Best Alternative. Econometrica, 47(3):641-654, May 1979. Google Scholar