Depth First Search in the Semi-streaming Model
Depth first search (DFS) tree is a fundamental data structure for solving various graph problems. The classical algorithm for building a DFS tree requires O(m+n) time for a given undirected graph G having n vertices and m edges. In the streaming model, an algorithm is allowed several passes (preferably single) over the input graph having a restriction on the size of local space used.
Now, a DFS tree of a graph can be trivially computed using a single pass if O(m) space is allowed. In the semi-streaming model allowing O(n) space, it can be computed in O(n) passes over the input stream, where each pass adds one vertex to the DFS tree. However, it remains an open problem to compute a DFS tree using o(n) passes using o(m) space even in any relaxed streaming environment.
We present the first semi-streaming algorithms that compute a DFS tree of an undirected graph in o(n) passes using o(m) space. We first describe an extremely simple algorithm that requires at most ceil[n/k] passes to compute a DFS tree using O(nk) space, where k is any positive integer. For example using k=sqrt{n}, we can compute a DFS tree in sqrt{n} passes using O(n sqrt{n}) space. We then improve this algorithm by using more involved techniques to reduce the number of passes to ceil[h/k] under similar space constraints, where h is the height of the computed DFS tree. In particular, this algorithm improves the bounds for the case where the computed DFS tree is shallow (having o(n) height). Moreover, this algorithm is presented in form of a framework that allows the flexibility of using any algorithm to maintain a DFS tree of a stored sparser subgraph as a black box, which may be of an independent interest. Both these algorithms essentially demonstrate the existence of a trade-off between the space and number of passes required for computing a DFS tree. Furthermore, we evaluate these algorithms experimentally which reveals their exceptional performance in practice. For both random and real graphs, they require merely a few passes even when allowed just O(n) space.
Depth First Search
DFS
Semi-Streaming
Streaming
Algorithm
Theory of computation~Graph algorithms analysis
Theory of computation~Data structures design and analysis
Theory of computation~Streaming, sublinear and near linear time algorithms
42:1-42:16
Regular Paper
Full version of the papers is available at https://arxiv.org/abs/1901.03689.
Shahbaz
Khan
Shahbaz Khan
Faculty of Computer Science, University of Vienna, Austria
https://orcid.org/0000-0001-9352-0088
This research work was supported by the European Research Council under the European Unionâ€™s Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement no. 340506.
Shashank
K. Mehta
Shashank K. Mehta
Dept. of Computer Science and Engineering, Indian Institute of Technology Kanpur, India
https://orcid.org/0000-0003-1830-2077
10.4230/LIPIcs.STACS.2019.42
Kook Jin Ahn and Sudipto Guha. Linear programming in the semi-streaming model with application to the maximum matching problem. Inf. Comput., 222:59-79, 2013.
Noga Alon, Yossi Matias, and Mario Szegedy. The Space Complexity of Approximating the Frequency Moments. Journal of Computer and System Sciences, 58(1):137-147, 1999.
Giorgio Ausiello, Donatella Firmani, and Luigi Laura. Datastream computation of graph biconnectivity: Articulation Points, Bridges, and Biconnected Components. In Theoretical Computer Science, 11th Italian Conference, ICTCS 2009, Cremona, Italy, September 28-30, 2009, Proceedings., pages 26-29, 2009.
Giorgio Ausiello, Donatella Firmani, and Luigi Laura. Real-time monitoring of undirected networks: Articulation points, bridges, and connected and biconnected components. Networks, 59(3):275-288, 2012. URL: http://dx.doi.org/10.1002/net.21450.
http://dx.doi.org/10.1002/net.21450
Ziv Bar-Yossef, Ravi Kumar, and D. Sivakumar. Reductions in Streaming Algorithms, with an Application to Counting Triangles in Graphs. In Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '02, pages 623-632, 2002.
Surender Baswana. Streaming algorithm for graph spanners - single pass and constant processing time per edge. Inf. Process. Lett., 106(3):110-114, 2008.
Surender Baswana, Shreejit Ray Chaudhury, Keerti Choudhary, and Shahbaz Khan. Dynamic DFS in Undirected Graphs: breaking the O(m) barrier. In ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 730-739, 2016.
Surender Baswana and Shahbaz Khan. Incremental Algorithm for Maintaining a DFS Tree for Undirected Graphs. Algorithmica, 79(2):466-483, 2017.
Glencora Borradaile, Claire Mathieu, and Theresa Migler. Lower bounds for testing digraph connectivity with one-pass streaming algorithms. CoRR, abs/1404.1323, 2014.
Adam L. Buchsbaum, Raffaele Giancarlo, and Jeffery Westbrook. On finding common neighborhoods in massive graphs. Theor. Comput. Sci., 299(1-3):707-718, 2003.
Michael Elkin. Streaming and fully dynamic centralized algorithms for constructing and maintaining sparse spanners. ACM Trans. Algorithms, 7(2):20:1-20:17, 2011.
Michael Elkin. Distributed exact shortest paths in sublinear time. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 757-770, 2017.
Shimon Even and Robert Endre Tarjan. Network Flow and Testing Graph Connectivity. SIAM J. Comput., 4(4):507-518, 1975.
Martin Farach-Colton, Tsan-sheng Hsu, Meng Li, and Meng-Tsung Tsai. Finding Articulation Points of Large Graphs in Linear Time. In Algorithms and Data Structures, WADS, pages 363-372, Cham, 2015. Springer International Publishing.
Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, and Jian Zhang. Graph Distances in the Streaming Model: The Value of Space. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '05, pages 745-754, 2005.
Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, and Jian Zhang. On Graph Problems in a Semi-streaming Model. Theor. Comput. Sci., 348(2):207-216, December 2005.
Joan Feigenbaum, Sampath Kannan, Martin J. Strauss, and Mahesh Viswanathan. An Approximate L1-Difference Algorithm for Massive Data Streams. SIAM J. Comput., 32(1):131-151, January 2003.
Philippe Flajolet and G. Nigel Martin. Probabilistic counting algorithms for data base applications. Journal of Computer and System Sciences, 31(2):182-209, 1985.
Sudipto Guha, Nick Koudas, and Kyuseok Shim. Data-streams and Histograms. In Proceedings of the Thirty-third Annual ACM Symposium on Theory of Computing, STOC '01, pages 471-475, 2001.
Venkatesan Guruswami and Krzysztof Onak. Superlinear Lower Bounds for Multipass Graph Processing. Algorithmica, 76(3):654-683, November 2016.
Venkatesan Guruswami and Krzysztof Onak. Superlinear Lower Bounds for Multipass Graph Processing. Algorithmica, 76(3):654-683, 2016.
Monika Rauch Henzinger, Prabhakar Raghavan, and Sridhar Rajagopalan. Computing on data streams. In External Memory Algorithms, Proceedings of a DIMACS Workshop, New Brunswick, New Jersey, USA, May 20-22, 1998, pages 107-118, 1998.
John E. Hopcroft and Richard M. Karp. An n^5/2 Algorithm for Maximum Matchings in Bipartite Graphs. SIAM J. Comput., 2(4):225-231, 1973.
John E. Hopcroft and Robert Endre Tarjan. Efficient Planarity Testing. J. ACM, 21(4):549-568, 1974.
Piotr Indyk. Stable Distributions, Pseudorandom Generators, Embeddings, and Data Stream Computation. J. ACM, 53(3):307-323, May 2006.
Sagar Kale and Sumedh Tirodkar. Maximum Matching in Two, Three, and a Few More Passes Over Graph Streams. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2017, August 16-18, 2017, Berkeley, CA, USA, pages 15:1-15:21, 2017.
Michael Kapralov. Better bounds for matchings in the streaming model. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2013, New Orleans, Louisiana, USA, January 6-8, 2013, pages 1679-1697, 2013.
Shahbaz Khan. Near Optimal Parallel Algorithms for Dynamic DFS in Undirected Graphs. In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2017, Washington DC, USA, July 24-26, 2017, pages 283-292, 2017.
Lasse Kliemann. Engineering a Bipartite Matching Algorithm in the Semi-Streaming Model. In Algorithm Engineering - Selected Results and Surveys, pages 352-378. Springer International Publishing, 2016.
Andrew McGregor. Finding Graph Matchings in Data Streams. In Approximation, Randomization and Combinatorial Optimization, Algorithms and Techniques, 8th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2005 and 9th InternationalWorkshop on Randomization and Computation, RANDOM 2005, Berkeley, CA, USA, August 22-24, 2005, Proceedings, pages 170-181, 2005.
Andrew McGregor. Graph Stream Algorithms: A Survey. SIGMOD Rec., 43(1):9-20, May 2014.
Shan Muthukrishnan. Data Streams: Algorithms and Applications. Foundations and Trends in Theoretical Computer Science, 1(2):117-236, 2005.
Thomas C. O'Connell. A Survey of Graph Algorithms Under Extended Streaming Models of Computation. In Fundamental Problems in Computing: Essays in Honor of Professor Daniel J. Rosenkrantz, pages 455-476, 2009.
Jan Matthias Ruhl. Efficient Algorithms for New Computational Models. PhD Thesis, Department of Computer Science, MIT, Cambridge, MA, 2003.
Robert Endre Tarjan. Depth-First Search and Linear Graph Algorithms. SIAM J. Comput., 1(2):146-160, 1972.
Robert Endre Tarjan. Finding Dominators in Directed Graphs. SIAM J. Comput., 3(1):62-89, 1974.
Robert Endre Tarjan. Efficiency of a Good But Not Linear Set Union Algorithm. J. ACM, 22(2):215-225, April 1975.
Robert Endre Tarjan. Edge-Disjoint Spanning Trees and Depth-First Search. Acta Inf., 6:171-185, 1976.
Robert Endre Tarjan and Jan van Leeuwen. Worst-case Analysis of Set Union Algorithms. J. ACM, 31(2):245-281, 1984.
Jeffery Westbrook and Robert Endre Tarjan. Maintaining Bridge-Connected and Biconnected Components On-Line. Algorithmica, 7(5&6):433-464, 1992.
Jian Zhang. A Survey on Streaming Algorithms for Massive Graphs. In Managing and Mining Graph Data, pages 393-420. Springer US, 2010.
Shahbaz Khan and Shashank K. Mehta
Creative Commons Attribution 3.0 Unported license
https://creativecommons.org/licenses/by/3.0/legalcode