O'Reach: Even Faster Reachability in Large Graphs

Authors Kathrin Hanauer , Christian Schulz , Jonathan Trummer



PDF
Thumbnail PDF

File

LIPIcs.SEA.2021.13.pdf
  • Filesize: 2.12 MB
  • 24 pages

Document Identifiers

Author Details

Kathrin Hanauer
  • University of Vienna, Faculty of Computer Science, Austria
Christian Schulz
  • Heidelberg University, Germany
Jonathan Trummer
  • University of Vienna, Faculty of Computer Science, Austria

Cite AsGet BibTex

Kathrin Hanauer, Christian Schulz, and Jonathan Trummer. O'Reach: Even Faster Reachability in Large Graphs. In 19th International Symposium on Experimental Algorithms (SEA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 190, pp. 13:1-13:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.SEA.2021.13

Abstract

One of the most fundamental problems in computer science is the reachability problem: Given a directed graph and two vertices s and t, can s reach t via a path? We revisit existing techniques and combine them with new approaches to support a large portion of reachability queries in constant time using a linear-sized reachability index. Our new algorithm O'Reach can be easily combined with previously developed solutions for the problem or run standalone. In a detailed experimental study, we compare a variety of algorithms with respect to their index-building and query times as well as their memory footprint on a diverse set of instances. Our experiments indicate that the query performance often depends strongly not only on the type of graph, but also on the result, i.e., reachable or unreachable. Furthermore, we show that previous algorithms are significantly sped up when combined with our new approach in almost all scenarios. Surprisingly, due to cache effects, a higher investment in space doesn't necessarily pay off: Reachability queries can often be answered even faster than single memory accesses in a precomputed full reachability matrix.

Subject Classification

ACM Subject Classification
  • Theory of computation → Graph algorithms analysis
  • Mathematics of computing → Graph algorithms
Keywords
  • Reachability
  • Static Graphs
  • Graph Algorithms
  • Reachability Index
  • Algorithm Engineering

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. D. Bader, A. Kappes, H. Meyerhenke, P. Sanders, C. Schulz, and D. Wagner. Benchmarking for Graph Clustering and Partitioning. In Encyclopedia of Social Network Analysis and Mining. Springer, 2014. Google Scholar
  2. Yangjun Chen and Yibin Chen. An efficient algorithm for answering graph reachability queries. In Gustavo Alonso, José A. Blakeley, and Arbee L. P. Chen, editors, Proceedings of the 24th International Conference on Data Engineering, ICDE 2008, April 7-12, 2008, Cancún, Mexico, pages 893-902. IEEE Computer Society, 2008. URL: https://doi.org/10.1109/ICDE.2008.4497498.
  3. James Cheng, Silu Huang, Huanhuan Wu, and Ada Wai-Chee Fu. TF-label: a topological-folding labeling scheme for reachability querying in a large graph. In Kenneth A. Ross, Divesh Srivastava, and Dimitris Papadias, editors, Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, June 22-27, 2013, pages 193-204. ACM, 2013. URL: https://doi.org/10.1145/2463676.2465286.
  4. Jiefeng Cheng, Jeffrey Xu Yu, Xuemin Lin, Haixun Wang, and Philip S. Yu. Fast computation of reachability labeling for large graphs. In Yannis E. Ioannidis, Marc H. Scholl, Joachim W. Schmidt, Florian Matthes, Michael Hatzopoulos, Klemens Böhm, Alfons Kemper, Torsten Grust, and Christian Böhm, editors, Advances in Database Technology - EDBT 2006, 10th International Conference on Extending Database Technology, Munich, Germany, March 26-31, 2006, Proceedings, volume 3896 of Lecture Notes in Computer Science, pages 961-979. Springer, 2006. URL: https://doi.org/10.1007/11687238_56.
  5. Edith Cohen, Eran Halperin, Haim Kaplan, and Uri Zwick. Reachability and distance queries via 2-hop labels. SIAM J. Comput., 32(5):1338-1355, 2003. URL: https://doi.org/10.1137/S0097539702403098.
  6. T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to algorithms, 2009. Google Scholar
  7. R. W. Floyd. Algorithm 97: Shortest path. Commun. ACM, 5(6):345, June 1962. URL: https://doi.org/10.1145/367766.368168.
  8. Daniel Funke, Sebastian Lamm, Peter Sanders, Christian Schulz, Darren Strash, and Moritz von Looz. Communication-free massively distributed graph generation. In 2018 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2018, Vancouver, BC, Canada, May 21 - May 25, 2018, 2018. Google Scholar
  9. Robert Geisberger, Peter Sanders, Dominik Schultes, and Daniel Delling. Contraction hierarchies: Faster and simpler hierarchical routing in road networks. In International Workshop on Experimental and Efficient Algorithms, pages 319-333. Springer, 2008. Google Scholar
  10. Robert Geisberger, Peter Sanders, Dominik Schultes, and Christian Vetter. Exact routing in large road networks using contraction hierarchies. Transportation Science, 46(3):388-404, 2012. Google Scholar
  11. Kathrin Hanauer, Monika Henzinger, and Christian Schulz. Faster Fully Dynamic Transitive Closure in Practice. In Simone Faro and Domenico Cantone, editors, 18th International Symposium on Experimental Algorithms (SEA 2020), volume 160 of Leibniz International Proceedings in Informatics (LIPIcs), pages 14:1-14:14, Dagstuhl, Germany, 2020. Schloss Dagstuhl-Leibniz-Zentrum für Informatik. URL: https://doi.org/10.4230/LIPIcs.SEA.2020.14.
  12. Kathrin Hanauer, Christian Schulz, and Jonathan Trummer. O'Reach: Even faster reachability in static graphs. CoRR, abs/2008.10932, 2021. URL: http://arxiv.org/abs/2008.10932.
  13. H. V. Jagadish. A compression technique to materialize transitive closure. ACM Trans. Database Syst., 15(4):558-598, 1990. URL: https://doi.org/10.1145/99935.99944.
  14. Ruoming Jin, Ning Ruan, Saikat Dey, and Jeffrey Xu Yu. SCARAB: scaling reachability computation on large graphs. In K. Selçuk Candan, Yi Chen, Richard T. Snodgrass, Luis Gravano, and Ariel Fuxman, editors, Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2012, Scottsdale, AZ, USA, May 20-24, 2012, pages 169-180. ACM, 2012. URL: https://doi.org/10.1145/2213836.2213856.
  15. Ruoming Jin, Ning Ruan, Yang Xiang, and Haixun Wang. Path-tree: An efficient reachability indexing scheme for large directed graphs. ACM Trans. Database Syst., 36(1):7:1-7:44, 2011. URL: https://doi.org/10.1145/1929934.1929941.
  16. Ruoming Jin and Guan Wang. Simple, fast, and scalable reachability oracle. Proc. VLDB Endow., 6(14):1978-1989, 2013. URL: https://doi.org/10.14778/2556549.2556578.
  17. Ruoming Jin, Yang Xiang, Ning Ruan, and David Fuhry. 3-hop: A high-compression indexing scheme for reachability query. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, SIGMOD ’09, page 813–826, New York, NY, USA, 2009. Association for Computing Machinery. URL: https://doi.org/10.1145/1559845.1559930.
  18. Ruoming Jin, Yang Xiang, Ning Ruan, and Haixun Wang. Efficiently answering reachability queries on very large directed graphs. In Jason Tsong-Li Wang, editor, Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10-12, 2008, pages 595-608. ACM, 2008. URL: https://doi.org/10.1145/1376616.1376677.
  19. A. B. Kahn. Topological sorting of large networks. Commun. ACM, 5(11):558–562, 1962. URL: https://doi.org/10.1145/368996.369025.
  20. F. Le Gall. Powers of tensors and fast matrix multiplication. In K. Nabeshima, K. Nagasaka, F. Winkler, and Á. Szántó, editors, International Symposium on Symbolic and Algebraic Computation, ISSAC '14, Kobe, Japan, July 23-25, 2014, pages 296-303. ACM, 2014. URL: https://doi.org/10.1145/2608628.2608664.
  21. Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, 2014.
  22. F. Merz and P. Sanders. Preach: A fast lightweight reachability index using pruning and contraction hierarchies. In A. S. Schulz and D. Wagner, editors, European Symposium on Algorithms, pages 701-712, Berlin, Heidelberg, 2014. Springer Berlin Heidelberg. Google Scholar
  23. Richard C. Murphy, Kyle B. Wheeler, Brian W. Barrett, and James A. Ang. Introducing the graph 500. Cray Users Group (CUG), 19:45-74, 2010. Google Scholar
  24. Thomas Reps. Program analysis via graph reachability. Information and software technology, 40(11-12):701-726, 1998. Google Scholar
  25. Thomas Reps, Susan Horwitz, and Mooly Sagiv. Precise interprocedural dataflow analysis via graph reachability. In Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 49-61, 1995. Google Scholar
  26. Ralf Schenkel, Anja Theobald, and Gerhard Weikum. HOPI: an efficient connection index for complex XML document collections. In Elisa Bertino, Stavros Christodoulakis, Dimitris Plexousakis, Vassilis Christophides, Manolis Koubarakis, Klemens Böhm, and Elena Ferrari, editors, Advances in Database Technology - EDBT 2004, 9th International Conference on Extending Database Technology, Heraklion, Crete, Greece, March 14-18, 2004, Proceedings, volume 2992 of Lecture Notes in Computer Science, pages 237-255. Springer, 2004. URL: https://doi.org/10.1007/978-3-540-24741-8_15.
  27. B. Scholz, C. Zhang, and C. Cifuentes. User-input dependence analysis via graph reachability. In 2008 Eighth IEEE International Working Conference on Source Code Analysis and Manipulation, pages 25-34, 2008. Google Scholar
  28. Jiao Su, Qing Zhu, Hao Wei, and Jeffrey Xu Yu. Reachability querying: Can it be even faster? IEEE Trans. Knowl. Data Eng., 29(3):683-697, 2017. URL: https://doi.org/10.1109/TKDE.2016.2631160.
  29. Robert Tarjan. Depth-first search and linear graph algorithms. SIAM Journal on Computing, 1(2):146-160, 1972. URL: https://doi.org/10.1137/0201010.
  30. Robert Endre Tarjan. Edge-disjoint spanning trees and depth-first search. Acta Informatica, 6(2):171-185, 1976. Google Scholar
  31. Silke Trißl and Ulf Leser. Fast and practical indexing and querying of very large graphs. In Chee Yong Chan, Beng Chin Ooi, and Aoying Zhou, editors, Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, June 12-14, 2007, pages 845-856. ACM, 2007. URL: https://doi.org/10.1145/1247480.1247573.
  32. Sebastiaan J. van Schaik and Oege de Moor. A memory efficient reachability data structure through bit vector compression. In Timos K. Sellis, Renée J. Miller, Anastasios Kementsietsidis, and Yannis Velegrakis, editors, Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2011, Athens, Greece, June 12-16, 2011, pages 913-924. ACM, 2011. URL: https://doi.org/10.1145/1989323.1989419.
  33. Renê Rodrigues Veloso, Loïc Cerf, Wagner Meira, and Mohammed J. Zaki. Reachability queries in very large graphs: A fast refined online search approach. In EDBT, pages 511-522, 2014. Google Scholar
  34. Haixun Wang, Hao He, Jun Yang, Philip S. Yu, and Jeffrey Xu Yu. Dual labeling: Answering graph reachability queries in constant time. In Ling Liu, Andreas Reuter, Kyu-Young Whang, and Jianjun Zhang, editors, Proceedings of the 22nd International Conference on Data Engineering, ICDE 2006, 3-8 April 2006, Atlanta, GA, USA, page 75. IEEE Computer Society, 2006. URL: https://doi.org/10.1109/ICDE.2006.53.
  35. S. Warshall. A theorem on boolean matrices. J. ACM, 9(1):11–12, 1962. URL: https://doi.org/10.1145/321105.321107.
  36. Hao Wei, Jeffrey Xu Yu, Can Lu, and Ruoming Jin. Reachability querying: an independent permutation labeling approach. VLDB J., 27(1):1-26, 2018. URL: https://doi.org/10.1007/s00778-017-0468-3.
  37. Yosuke Yano, Takuya Akiba, Yoichi Iwata, and Yuichi Yoshida. Fast and scalable reachability queries on graphs by pruned labeling with landmarks and paths. In Qi He, Arun Iyengar, Wolfgang Nejdl, Jian Pei, and Rajeev Rastogi, editors, 22nd ACM International Conference on Information and Knowledge Management, CIKM'13, San Francisco, CA, USA, October 27 - November 1, 2013, pages 1601-1606. ACM, 2013. URL: https://doi.org/10.1145/2505515.2505724.
  38. Hilmi Yildirim, Vineet Chaoji, and Mohammed J. Zaki. Grail: Scalable reachability index for large graphs. Proc. VLDB Endow., 3(1–2):276–284, 2010. URL: https://doi.org/10.14778/1920841.1920879.
  39. Hilmi Yıldırım, Vineet Chaoji, and Mohammed J Zaki. GRAIL: a scalable index for reachability queries in very large graphs. The VLDB Journal, 21(4):509-534, 2012. Google Scholar
  40. Jeffrey Xu Yu and Jiefeng Cheng. Graph reachability queries: A survey. In Charu C. Aggarwal and Haixun Wang, editors, Managing and Mining Graph Data, volume 40 of Advances in Database Systems, pages 181-215. Springer, 2010. URL: https://doi.org/10.1007/978-1-4419-6045-0_6.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail