Support Testing in the Huge Object Model

Adar, Tomer; Fischer, Eldar; Levi, Amit

doi:10.4230/LIPIcs.APPROX/RANDOM.2024.46

Abstract

The Huge Object model is a distribution testing model in which we are given access to independent samples from an unknown distribution over the set of strings {0,1}ⁿ, but are only allowed to query a few bits from the samples. We investigate the problem of testing whether a distribution is supported on m elements in this model. It turns out that the behavior of this property is surprisingly intricate, especially when also considering the question of adaptivity.
We prove lower and upper bounds for both adaptive and non-adaptive algorithms in the one-sided and two-sided error regimes. Our bounds are tight when m is fixed to a constant (and the distance parameter ε is the only variable). For the general case, our bounds are at most O(log m) apart. In particular, our results show a surprising O(log ε^{-1}) gap between the number of queries required for non-adaptive testing as compared to adaptive testing. For one-sided error testing, we also show that an O(log m) gap between the number of samples and the number of queries is necessary. Our results utilize a wide variety of combinatorial and probabilistic methods.
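
The access model described above can be made concrete with a small sketch. It is only an illustration of sample-and-query access, not the algorithm from the paper: the names `HugeObjectOracle` and `naive_support_check` are hypothetical, and the checker is a naive non-adaptive procedure with no claimed query complexity. It has one-sided error in the sense that it rejects only after witnessing more than m distinct strings, so it never rejects a distribution supported on at most m elements.

```python
import random


class HugeObjectOracle:
    """Illustrative sample-and-query access to a distribution over n-bit strings."""

    def __init__(self, support, weights, seed=0):
        self._support = support      # list of equal-length bit tuples (hypothetical input)
        self._weights = weights      # sampling weight of each string
        self._rng = random.Random(seed)
        self._samples = []           # drawn strings; never exposed as a whole
        self.queries = 0             # number of single-bit queries made so far

    def draw(self):
        """Draw a fresh independent sample and return a handle to it."""
        s = self._rng.choices(self._support, weights=self._weights, k=1)[0]
        self._samples.append(s)
        return len(self._samples) - 1

    def query(self, handle, index):
        """Reveal one bit of a previously drawn sample."""
        self.queries += 1
        return self._samples[handle][index]


def naive_support_check(oracle, n, m, num_samples, num_coords, seed=0):
    """Return False only if more than m distinct strings are witnessed."""
    rng = random.Random(seed)
    # Coordinates are fixed before any answer is seen, so the queries are non-adaptive.
    coords = rng.sample(range(n), num_coords)
    patterns = set()
    for _ in range(num_samples):
        handle = oracle.draw()
        patterns.add(tuple(oracle.query(handle, i) for i in coords))
    # More than m distinct projections imply more than m distinct sampled strings.
    return len(patterns) <= m
```

For example, an oracle over the two strings 0000 and 1111 always passes the check with m = 2, while a uniform distribution over all 3-bit strings fails it with overwhelming probability once enough samples are drawn and all three coordinates are queried.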

Cite As

Tomer Adar, Eldar Fischer, and Amit Levi. Support Testing in the Huge Object Model. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 317, pp. 46:1-46:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024) https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2024.46

Author Details

Tomer Adar

Technion - Israel Institute of Technology, Haifa, Israel

Eldar Fischer

Technion - Israel Institute of Technology, Haifa, Israel

Amit Levi

University of Haifa, Israel

Funding

Fischer, Eldar: Research supported by Israel Science Foundation grant number 879/22.

