
Practical Low-Dimensional Halfspace Range Space Sampling

Authors Michael Matheny, Jeff M. Phillips




File

LIPIcs.ESA.2018.62.pdf
  • Filesize: 0.62 MB
  • 14 pages

Author Details

Michael Matheny
  • University of Utah, USA
Jeff M. Phillips
  • University of Utah, USA

Cite As

Michael Matheny and Jeff M. Phillips. Practical Low-Dimensional Halfspace Range Space Sampling. In 26th Annual European Symposium on Algorithms (ESA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 112, pp. 62:1-62:14, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.ESA.2018.62

Abstract

We develop, analyze, implement, and compare new algorithms for creating epsilon-samples of range spaces defined by halfspaces which have size sub-quadratic in 1/epsilon, and have runtime linear in the input size and near-quadratic in 1/epsilon. The key to our solution is an efficient construction of partition trees. Despite not requiring any techniques developed after the early 1990s, apparently such a result was never explicitly described. We demonstrate that our implementations, including new implementations of several variants of partition trees, do indeed run in time linear in the input, appear to run linear in output size, and observe smaller error for the same size sample compared to the ubiquitous random sample (which requires size quadratic in 1/epsilon). This result has direct applications in speeding up discrepancy evaluation, approximate range counting, and spatial anomaly detection.
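
To make the baseline in the abstract concrete, here is a minimal sketch of the "ubiquitous random sample" it is compared against. It is not the authors' partition-tree construction. It draws a uniform sample of size proportional to 1/epsilon^2 from a planar point set and estimates how far the sample's halfplane counts drift from those of the full set. The function names, the constant `c`, and the choice to test random halfplanes through input points are all assumptions made for illustration. The measured value is only a lower bound on the true worst-case error over all halfplanes.

```python
import math
import random


def random_sample(points, eps, c=1.0, rng=None):
    """Uniform sample of size ceil(c / eps^2), capped at len(points).

    The constant c is a hypothetical tuning knob, not a value from the paper.
    """
    if rng is None:
        rng = random.Random()
    k = min(len(points), math.ceil(c / eps ** 2))
    return rng.sample(points, k)


def halfplane_fraction(points, a, b, t):
    """Fraction of points (x, y) satisfying a*x + b*y <= t."""
    inside = sum(1 for (x, y) in points if a * x + b * y <= t)
    return inside / len(points)


def estimate_error(points, sample, trials=200, rng=None):
    """Largest observed gap between full-set and sample halfplane fractions.

    Halfplanes are chosen with a random normal direction and a boundary
    passing through a random input point. This probes only `trials`
    halfplanes, so it underestimates the true maximum error.
    """
    if rng is None:
        rng = random.Random()
    worst = 0.0
    for _ in range(trials):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        a, b = math.cos(theta), math.sin(theta)
        px, py = rng.choice(points)
        t = a * px + b * py
        gap = abs(halfplane_fraction(points, a, b, t)
                  - halfplane_fraction(sample, a, b, t))
        worst = max(worst, gap)
    return worst


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.random(), rng.random()) for _ in range(5000)]
    eps = 0.125
    s = random_sample(pts, eps, rng=rng)
    print(len(s), estimate_error(pts, s, rng=rng))
```

Evaluating each halfplane here takes time linear in the point set, which is fine for a sanity check. The abstract's claim is that a smaller sample (sub-quadratic in 1/epsilon) can reach the same error, which this sketch does not attempt to reproduce.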

Subject Classification

ACM Subject Classification
  • Theory of computation → Computational geometry
Keywords
  • Partitions
  • Range Spaces
  • Sampling
  • Halfspaces
