Distribution-Sensitive Bounds on Relative Approximations of Geometric Ranges

Authors: Yufei Tao, Yu Wang



File

LIPIcs.SoCG.2019.57.pdf
  • Filesize: 0.53 MB
  • 14 pages

Author Details

Yufei Tao
  • Chinese University of Hong Kong, Hong Kong
Yu Wang
  • Chinese University of Hong Kong, Hong Kong

Cite As

Yufei Tao and Yu Wang. Distribution-Sensitive Bounds on Relative Approximations of Geometric Ranges. In 35th International Symposium on Computational Geometry (SoCG 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 129, pp. 57:1-57:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.SoCG.2019.57

Abstract

A family R of ranges and a set X of points, all in R^d, together define a range space (X, R|_X), where R|_X = {X ∩ h | h ∈ R}. We want to find a structure to estimate the quantity |X ∩ h|/|X| for any range h ∈ R with the (ρ, ε)-guarantee: (i) if |X ∩ h|/|X| > ρ, the estimate must have a relative error at most ε; (ii) otherwise, the estimate must have an absolute error at most ρε. The objective is to minimize the size of the structure. Currently, the dominant solution is to compute a relative (ρ, ε)-approximation, which is a subset of X with Õ(λ/(ρε²)) points, where λ is the VC-dimension of (X, R|_X) and Õ hides polylog factors. This paper shows a more general bound sensitive to the content of X. We give a structure that stores O(log(1/ρ)) integers plus Õ(θ · (λ/ε²)) points of X, where θ, called the disagreement coefficient, measures how much the ranges differ from each other in their intersections with X. The value of θ is between 1 and 1/ρ, so our space bound is never worse than that of relative (ρ, ε)-approximations, but we improve the latter's 1/ρ term whenever θ = o(1/(ρ log(1/ρ))). We also prove that, in the worst case, summaries with the (ρ, 1/2)-guarantee must consume Ω(θ) words even for d = 2 and λ ≤ 3. We then constrain R to be the set of halfspaces in R^d for a constant d, and prove the existence of structures of size o(1/(ρε²)) offering (ρ, ε)-guarantees when X is generated from various stochastic distributions. This is the first formal justification of why the term 1/ρ is not compulsory for "realistic" inputs.
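
For illustration, the baseline the abstract compares against, a relative (ρ, ε)-approximation, can be obtained (up to polylog factors) by uniform random sampling. The Python sketch below shows that baseline for halfspaces in R^2, estimating |X ∩ h|/|X| by the corresponding sample fraction. It is not the structure proposed in the paper; the sample-size constant c, the default VC-dimension, and the halfspace encoding (a, b, t) are illustrative assumptions.

import random

# Minimal sketch of the sampling baseline: a uniform sample of size roughly
# c * lambda / (rho * eps^2) serves as a relative (rho, eps)-approximation
# with high probability (up to polylog factors). The constant c = 8.0 and
# vc_dim = 3 are arbitrary illustrative choices, not values from the paper.
def sample_summary(X, rho, eps, vc_dim=3, c=8.0):
    m = min(len(X), int(c * vc_dim / (rho * eps * eps)))
    return random.sample(X, m)

# Estimate |X ∩ h|/|X| by the sample fraction |S ∩ h|/|S| for a halfspace
# h = {(x, y) : a*x + b*y <= t}, encoded here as the tuple (a, b, t).
def estimate_fraction(sample, halfspace):
    a, b, t = halfspace
    hits = sum(1 for (x, y) in sample if a * x + b * y <= t)
    return hits / len(sample)

if __name__ == "__main__":
    random.seed(0)
    X = [(random.random(), random.random()) for _ in range(100_000)]
    rho, eps = 0.01, 0.25
    S = sample_summary(X, rho, eps)
    h = (1.0, 1.0, 1.0)  # halfspace x + y <= 1
    est = estimate_fraction(S, h)
    exact = sum(1 for (x, y) in X if x + y <= 1.0) / len(X)
    # (rho, eps)-guarantee: relative error <= eps when exact >= rho,
    # absolute error <= rho * eps otherwise.
    print(f"exact = {exact:.4f}, estimate = {est:.4f}")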

Subject Classification

ACM Subject Classification
  • Theory of computation → Computational geometry
Keywords
  • Relative Approximation
  • Disagreement Coefficient
  • Data Summary
