Categorical Range Reporting with Frequencies

Authors Arnab Ganguly, J. Ian Munro, Yakov Nekrich, Rahul Shah, Sharma V. Thankachan



PDF
Thumbnail PDF

File

LIPIcs.ICDT.2019.9.pdf
  • Filesize: 0.8 MB
  • 19 pages

Document Identifiers

Author Details

Arnab Ganguly
  • Dept. of Computer Science, University of Wisconsin, Whitewater, USA
J. Ian Munro
  • Cheriton School of Computer Science, University of Waterloo, Canada
Yakov Nekrich
  • Cheriton School of Computer Science, University of Waterloo, Canada
Rahul Shah
  • Dept. of Computer Science, Baton Rouge, USA
Sharma V. Thankachan
  • Dept. of Computer Science, University of Central Florida

Cite AsGet BibTex

Arnab Ganguly, J. Ian Munro, Yakov Nekrich, Rahul Shah, and Sharma V. Thankachan. Categorical Range Reporting with Frequencies. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 9:1-9:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.ICDT.2019.9

Abstract

In this paper, we consider a variant of the color range reporting problem called color reporting with frequencies. Our goal is to pre-process a set of colored points into a data structure, so that given a query range Q, we can report all colors that appear in Q, along with their respective frequencies. In other words, for each reported color, we also output the number of times it occurs in Q. We describe an external-memory data structure that uses O(N(1+log^2D/log N)) words and answers one-dimensional queries in O(1 +K/B) I/Os, where N is the total number of points in the data structure, D is the total number of colors in the data structure, K is the number of reported colors, and B is the block size. Next we turn to an approximate version of this problem: report all colors sigma that appear in the query range; for every reported color, we provide a constant-factor approximation on its frequency. We consider color reporting with approximate frequencies in two dimensions. Our data structure uses O(N) space and answers two-dimensional queries in O(log_B N +log^*B + K/B) I/Os in the special case when the query range is bounded on two sides. As a corollary, we can also answer one-dimensional approximate queries within the same time and space bounds.

Subject Classification

ACM Subject Classification
  • Theory of computation → Data structures design and analysis
Keywords
  • Data Structures
  • Range Reporting
  • Range Counting
  • Categorical Range Reporting
  • Orthogonal Range Query

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Peyman Afshani. On Dominance Reporting in 3D. In ESA, pages 41-51, 2008. Google Scholar
  2. Pankaj K. Agarwal, Alon Efrat, and Micha Sharir. Vertical Decomposition of Shallow Levels in 3-Dimensional Arrangements and Its Applications. SIAM J. Comput., 29(3):912-953, 1999. URL: http://dx.doi.org/10.1137/S0097539795295936.
  3. Lars Arge, Vasilis Samoladas, and Jeffrey Scott Vitter. On Two-Dimensional Indexability and Optimal Range Search Indexing. In Proc. 18th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), pages 346-357, 1999. URL: http://dx.doi.org/10.1145/303976.304010.
  4. Lars Arge, Vasilis Samoladas, and Ke Yi. Optimal External Memory Planar Point Enclosure. Algorithmica, 54(3):337-352, 2009. URL: http://dx.doi.org/10.1007/s00453-007-9126-2.
  5. Lars Arge and Jeffrey Scott Vitter. Optimal External Memory Interval Management. SIAM J. Comput., 32(6):1488-1508, 2003. URL: http://dx.doi.org/10.1137/S009753970240481X.
  6. Bruno Becker, Stephan Gschwind, Thomas Ohler, Bernhard Seeger, and Peter Widmayer. An Asymptotically Optimal Multiversion B-Tree. VLDB J., 5(4):264-275, 1996. URL: http://dx.doi.org/10.1007/s007780050028.
  7. Panayiotis Bozanis, Nectarios Kitsios, Christos Makris, and Athanasios K. Tsakalidis. New Upper Bounds for Generalized Intersection Searching Problems. In Proc. 22nd International Colloquium on Automata, Languages and Programming (ICALP), pages 464-474, 1995. URL: http://dx.doi.org/10.1007/3-540-60084-1_97.
  8. Panayiotis Bozanis, Nectarios Kitsios, Christos Makris, and Athanasios K. Tsakalidis. New Results on Intersection Query Problems. Comput. J., 40(1):22-29, 1997. URL: http://dx.doi.org/10.1093/comjnl/40.1.22.
  9. Timothy M. Chan and Bryan T. Wilkinson. Adaptive and Approximate Orthogonal Range Counting. In Proc. 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 241-251, 2013. URL: http://dx.doi.org/10.1137/1.9781611973105.18.
  10. Arnab Ganguly, Wing-Kai Hon, and Rahul Shah. Stabbing Colors in One Dimension. In 2017 Data Compression Conference, DCC 2017, Snowbird, UT, USA, April 4-7, 2017, pages 280-289, 2017. URL: http://dx.doi.org/10.1109/DCC.2017.44.
  11. Prosenjit Gupta, Ravi Janardan, and Michiel H. M. Smid. Further Results on Generalized Intersection Searching Problems: counting, Reporting, and Dynamization. Journal of Algorithms, 19(2):282-317, 1995. URL: http://dx.doi.org/10.1006/jagm.1995.1038.
  12. Prosenjit Gupta, Ravi Janardan, and Michiel H. M. Smid. Algorithms for Generalized Halfspace Range Searching and Other Intersection Searching Problems. Comput. Geom., 6:1-19, 1996. URL: http://dx.doi.org/10.1016/0925-7721(95)00012-7.
  13. Ravi Janardan and Mario A. Lopez. Generalized intersection searching problems. International Journal of Computational Geometry and Applications, 3(1):39-69, 1993. Google Scholar
  14. Marek Karpinski and Yakov Nekrich. Top-K Color Queries for Document Retrieval. In Proc. 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 401-411, 2011. URL: http://www.siam.org/proceedings/soda/2011/SODA11_032_karpinskim.pdf.
  15. Kasper Green Larsen and Rasmus Pagh. I/O-efficient data structures for colored range and prefix reporting. In Proc. 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 583-592, 2012. URL: http://portal.acm.org/citation.cfm?id=2095165&CFID=63838676&CFTOKEN=79617016.
  16. Kasper Green Larsen and Freek van Walderveen. Near-Optimal Range Reporting Structures for Categorical Data. In Proc. 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 256-276, 2013. Google Scholar
  17. Jiří Matoušek. Reporting Points in Halfspaces. In Proc. 32nd Annual Symposium on Foundations of Computer Science (FOCS), pages 207-215, 1991. URL: http://dx.doi.org/10.1109/SFCS.1991.185370.
  18. S. Muthukrishnan. Efficient algorithms for document retrieval problems. In Proc. 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 657-666, 2002. URL: http://dx.doi.org/10.1145/545381.545469.
  19. Alexandros Nanopoulos and Panayiotis Bozanis. Categorical Range Queires in Large Databases. In Proc. 8th International Symposium on Advances in Spatial and Temporal Databases, (SSTD), pages 122-139, 2003. URL: http://dx.doi.org/10.1007/978-3-540-45072-6_8.
  20. Gonzalo Navarro and Yakov Nekrich. Top-k document retrieval in optimal time and linear space. In Proc. 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1066-1077, 2012. URL: http://portal.acm.org/citation.cfm?id=2095200&CFID=63838676&CFTOKEN=79617016.
  21. Yakov Nekrich. External Memory Range Reporting on a Grid. In Proc. 18th International Symposium on Algorithms and Computation (ISAAC), pages 525-535, 2007. URL: http://dx.doi.org/10.1007/978-3-540-77120-3_46.
  22. Yakov Nekrich. Data Structures for Approximate Orthogonal Range Counting. In Proc. 20th International Symposium on Algorithms and Computation (ISAAC), pages 183-192, 2009. URL: http://dx.doi.org/10.1007/978-3-642-10631-6_20.
  23. Yakov Nekrich. Efficient range searching for categorical and plain data. ACM Trans. Database Syst., 39(1):9, 2014. URL: http://dx.doi.org/10.1145/2543924.
  24. Yakov Nekrich and Jeffrey Scott Vitter. Optimal Color Range Reporting in One Dimension. In Proc. 21st Annual European Symposium Algorithms (ESA), pages 743-754, 2013. URL: http://dx.doi.org/10.1007/978-3-642-40450-4_63.
  25. Manish Patil, Sharma V. Thankachan, Rahul Shah, Yakov Nekrich, and Jeffrey Scott Vitter. Categorical range maxima queries. In Proc. 33rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), pages 266-277, 2014. URL: http://dx.doi.org/10.1145/2594538.2594557.
  26. Mihai Patrascu. Succincter. In Proc. 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 305-313, 2008. URL: http://dx.doi.org/10.1109/FOCS.2008.83.
  27. Saladi Rahul. Improved Bounds for Orthogonal Point Enclosure Query and Point Location in Orthogonal Subdivisions in ℝ^3. In Proc.26th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 200-211, 2015. URL: http://dx.doi.org/10.1137/1.9781611973730.15.
  28. Saladi Rahul. Approximate Range Counting Revisited. In Proc. 33rd International Symposium on Computational Geometry (SoCG), pages 55:1-55:15, 2017. URL: http://dx.doi.org/10.4230/LIPIcs.SoCG.2017.55.
  29. Kunihiko Sadakane. Succinct data structures for flexible text retrieval systems. J. Discrete Algorithms, 5(1):12-22, 2007. URL: http://dx.doi.org/10.1016/j.jda.2006.03.011.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail