Support Testing in the Huge Object Model

Authors: Tomer Adar, Eldar Fischer, Amit Levi



Author Details

Tomer Adar
  • Technion - Israel Institute of Technology, Haifa, Israel
Eldar Fischer
  • Technion - Israel Institute of Technology, Haifa, Israel
Amit Levi
  • University of Haifa, Israel

Cite As

Tomer Adar, Eldar Fischer, and Amit Levi. Support Testing in the Huge Object Model. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 317, pp. 46:1-46:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2024.46

Abstract

The Huge Object model is a distribution testing model in which we are given access to independent samples from an unknown distribution over the set of strings {0,1}ⁿ, but are only allowed to query a few bits from the samples. We investigate the problem of testing whether a distribution is supported on m elements in this model. It turns out that the behavior of this property is surprisingly intricate, especially when also considering the question of adaptivity. We prove lower and upper bounds for both adaptive and non-adaptive algorithms in the one-sided and two-sided error regime. Our bounds are tight when m is fixed to a constant (and the distance parameter ε is the only variable). For the general case, our bounds are at most O(log m) apart. In particular, our results show a surprising O(log ε^{-1}) gap between the number of queries required for non-adaptive testing as compared to adaptive testing. For one-sided error testing, we also show that an O(log m) gap between the number of samples and the number of queries is necessary. Our results utilize a wide variety of combinatorial and probabilistic methods.
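To make the access model concrete, the following is a minimal sketch of a sample-and-query oracle together with a naive non-adaptive, one-sided tester. It is an illustration only and is not the algorithm from the paper. It reads "supported on m elements" as "support size at most m", which is an assumption made here. The names `HugeObjectOracle`, `naive_support_test`, `num_samples` and `num_coords` are hypothetical, and no claim is made that this tester matches the stated bounds.

```python
import random


class HugeObjectOracle:
    """Hypothetical oracle: samples are hidden strings, bits are read on demand."""

    def __init__(self, support, weights, rng):
        self.support = support      # list of equal-length bit strings
        self.weights = weights      # sampling weights, one per string
        self.rng = rng
        self.samples = []           # hidden samples drawn so far
        self.queries = 0            # number of bit queries made

    def draw_sample(self):
        """Draw one independent sample and return a handle to it (not its bits)."""
        s = self.rng.choices(self.support, weights=self.weights, k=1)[0]
        self.samples.append(s)
        return len(self.samples) - 1

    def query(self, sample_id, bit_index):
        """Reveal a single bit of a previously drawn sample."""
        self.queries += 1
        return self.samples[sample_id][bit_index]


def naive_support_test(oracle, n, m, num_samples, num_coords, rng):
    """Accept (True) unless more than m distinct projections are observed.

    The coordinates are fixed before any bit is read, so the tester is
    non-adaptive. At most m distinct strings can give at most m distinct
    projections, so a distribution with support size at most m is never
    rejected (one-sided error).
    """
    coords = rng.sample(range(n), num_coords)
    ids = [oracle.draw_sample() for _ in range(num_samples)]
    patterns = {tuple(oracle.query(i, j) for j in coords) for i in ids}
    return len(patterns) <= m


rng = random.Random(0)
oracle = HugeObjectOracle(["00000000", "11111111"], [0.5, 0.5], rng)
accepted = naive_support_test(oracle, n=8, m=2, num_samples=50, num_coords=3, rng=rng)
```

The query cost of this sketch is `num_samples * num_coords`, which keeps the distinction between the number of samples and the number of queries visible. How to choose both parameters, and how adaptivity changes the picture, is the subject of the paper.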

Subject Classification

ACM Subject Classification
  • Theory of computation → Streaming, sublinear and near linear time algorithms
Keywords
  • Huge-Object model
  • Property Testing

