Distribution-Sensitive Bounds on Relative Approximations of Geometric Ranges

Authors: Yufei Tao, Yu Wang



File

LIPIcs.SoCG.2019.57.pdf
  • Filesize: 0.53 MB
  • 14 pages

Author Details

Yufei Tao
  • Chinese University of Hong Kong, Hong Kong
Yu Wang
  • Chinese University of Hong Kong, Hong Kong

Cite As

Yufei Tao and Yu Wang. Distribution-Sensitive Bounds on Relative Approximations of Geometric Ranges. In 35th International Symposium on Computational Geometry (SoCG 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 129, pp. 57:1-57:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.SoCG.2019.57

Abstract

A family R of ranges and a set X of points, all in R^d, together define a range space (X, R|_X), where R|_X = {X ∩ h | h ∈ R}. We want to find a structure to estimate the quantity |X ∩ h|/|X| for any range h ∈ R with the (ρ, ε)-guarantee: (i) if |X ∩ h|/|X| > ρ, the estimate must have a relative error at most ε; (ii) otherwise, the estimate must have an absolute error at most ρε. The objective is to minimize the size of the structure. Currently, the dominant solution is to compute a relative (ρ, ε)-approximation, which is a subset of X with Õ(λ/(ρε²)) points, where λ is the VC-dimension of (X, R|_X) and Õ hides polylog factors. This paper shows a more general bound sensitive to the content of X. We give a structure that stores O(log(1/ρ)) integers plus Õ(θ · (λ/ε²)) points of X, where θ, called the disagreement coefficient, measures how much the ranges differ from each other in their intersections with X. The value of θ is between 1 and 1/ρ, so our space bound is never worse than that of relative (ρ, ε)-approximations, but we improve the latter's 1/ρ term whenever θ = o(1/(ρ log(1/ρ))). We also prove that, in the worst case, summaries with the (ρ, 1/2)-guarantee must consume Ω(θ) words even for d = 2 and λ ≤ 3. We then constrain R to be the set of halfspaces in R^d for a constant d, and prove the existence of structures of size o(1/(ρε²)) offering (ρ, ε)-guarantees when X is generated from various stochastic distributions. This is the first formal justification of why the term 1/ρ is not compulsory for "realistic" inputs.
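
For illustration, the baseline the abstract compares against, a relative (ρ, ε)-approximation, can be obtained (up to polylog factors) by uniform random sampling. The Python sketch below shows that baseline for halfspaces in R^2, estimating |X ∩ h|/|X| by the corresponding sample fraction. It is not the structure proposed in the paper; the sample-size constant c, the default VC-dimension, and the halfspace encoding (a, b, t) are illustrative assumptions.

import random

# Minimal sketch of the sampling baseline: a uniform sample of size roughly
# c * lambda / (rho * eps^2) serves as a relative (rho, eps)-approximation
# with high probability (up to polylog factors). The constant c = 8.0 and
# vc_dim = 3 are arbitrary illustrative choices, not values from the paper.
def sample_summary(X, rho, eps, vc_dim=3, c=8.0):
    m = min(len(X), int(c * vc_dim / (rho * eps * eps)))
    return random.sample(X, m)

# Estimate |X ∩ h|/|X| by the sample fraction |S ∩ h|/|S| for a halfspace
# h = {(x, y) : a*x + b*y <= t}, encoded here as the tuple (a, b, t).
def estimate_fraction(sample, halfspace):
    a, b, t = halfspace
    hits = sum(1 for (x, y) in sample if a * x + b * y <= t)
    return hits / len(sample)

if __name__ == "__main__":
    random.seed(0)
    X = [(random.random(), random.random()) for _ in range(100_000)]
    rho, eps = 0.01, 0.25
    S = sample_summary(X, rho, eps)
    h = (1.0, 1.0, 1.0)  # halfspace x + y <= 1
    est = estimate_fraction(S, h)
    exact = sum(1 for (x, y) in X if x + y <= 1.0) / len(X)
    # (rho, eps)-guarantee: relative error <= eps when exact >= rho,
    # absolute error <= rho * eps otherwise.
    print(f"exact = {exact:.4f}, estimate = {est:.4f}")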

Subject Classification

ACM Subject Classification
  • Theory of computation → Computational geometry
Keywords
  • Relative Approximation
  • Disagreement Coefficient
  • Data Summary
