Document Open Access Logo

Depth First Search in the Semi-streaming Model

Authors Shahbaz Khan , Shashank K. Mehta

Thumbnail PDF


  • Filesize: 0.49 MB
  • 16 pages

Document Identifiers

Author Details

Shahbaz Khan
  • Faculty of Computer Science, University of Vienna, Austria
Shashank K. Mehta
  • Dept. of Computer Science and Engineering, Indian Institute of Technology Kanpur, India

Cite AsGet BibTex

Shahbaz Khan and Shashank K. Mehta. Depth First Search in the Semi-streaming Model. In 36th International Symposium on Theoretical Aspects of Computer Science (STACS 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 126, pp. 42:1-42:16, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)


Depth first search (DFS) tree is a fundamental data structure for solving various graph problems. The classical algorithm for building a DFS tree requires O(m+n) time for a given undirected graph G having n vertices and m edges. In the streaming model, an algorithm is allowed several passes (preferably single) over the input graph having a restriction on the size of local space used. Now, a DFS tree of a graph can be trivially computed using a single pass if O(m) space is allowed. In the semi-streaming model allowing O(n) space, it can be computed in O(n) passes over the input stream, where each pass adds one vertex to the DFS tree. However, it remains an open problem to compute a DFS tree using o(n) passes using o(m) space even in any relaxed streaming environment. We present the first semi-streaming algorithms that compute a DFS tree of an undirected graph in o(n) passes using o(m) space. We first describe an extremely simple algorithm that requires at most ceil[n/k] passes to compute a DFS tree using O(nk) space, where k is any positive integer. For example using k=sqrt{n}, we can compute a DFS tree in sqrt{n} passes using O(n sqrt{n}) space. We then improve this algorithm by using more involved techniques to reduce the number of passes to ceil[h/k] under similar space constraints, where h is the height of the computed DFS tree. In particular, this algorithm improves the bounds for the case where the computed DFS tree is shallow (having o(n) height). Moreover, this algorithm is presented in form of a framework that allows the flexibility of using any algorithm to maintain a DFS tree of a stored sparser subgraph as a black box, which may be of an independent interest. Both these algorithms essentially demonstrate the existence of a trade-off between the space and number of passes required for computing a DFS tree. Furthermore, we evaluate these algorithms experimentally which reveals their exceptional performance in practice. For both random and real graphs, they require merely a few passes even when allowed just O(n) space.

Subject Classification

ACM Subject Classification
  • Theory of computation → Graph algorithms analysis
  • Theory of computation → Data structures design and analysis
  • Theory of computation → Streaming, sublinear and near linear time algorithms
  • Depth First Search
  • DFS
  • Semi-Streaming
  • Streaming
  • Algorithm


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. Kook Jin Ahn and Sudipto Guha. Linear programming in the semi-streaming model with application to the maximum matching problem. Inf. Comput., 222:59-79, 2013. Google Scholar
  2. Noga Alon, Yossi Matias, and Mario Szegedy. The Space Complexity of Approximating the Frequency Moments. Journal of Computer and System Sciences, 58(1):137-147, 1999. Google Scholar
  3. Giorgio Ausiello, Donatella Firmani, and Luigi Laura. Datastream computation of graph biconnectivity: Articulation Points, Bridges, and Biconnected Components. In Theoretical Computer Science, 11th Italian Conference, ICTCS 2009, Cremona, Italy, September 28-30, 2009, Proceedings., pages 26-29, 2009. Google Scholar
  4. Giorgio Ausiello, Donatella Firmani, and Luigi Laura. Real-time monitoring of undirected networks: Articulation points, bridges, and connected and biconnected components. Networks, 59(3):275-288, 2012. URL:
  5. Ziv Bar-Yossef, Ravi Kumar, and D. Sivakumar. Reductions in Streaming Algorithms, with an Application to Counting Triangles in Graphs. In Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '02, pages 623-632, 2002. Google Scholar
  6. Surender Baswana. Streaming algorithm for graph spanners - single pass and constant processing time per edge. Inf. Process. Lett., 106(3):110-114, 2008. Google Scholar
  7. Surender Baswana, Shreejit Ray Chaudhury, Keerti Choudhary, and Shahbaz Khan. Dynamic DFS in Undirected Graphs: breaking the O(m) barrier. In ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 730-739, 2016. Google Scholar
  8. Surender Baswana and Shahbaz Khan. Incremental Algorithm for Maintaining a DFS Tree for Undirected Graphs. Algorithmica, 79(2):466-483, 2017. Google Scholar
  9. Glencora Borradaile, Claire Mathieu, and Theresa Migler. Lower bounds for testing digraph connectivity with one-pass streaming algorithms. CoRR, abs/1404.1323, 2014. Google Scholar
  10. Adam L. Buchsbaum, Raffaele Giancarlo, and Jeffery Westbrook. On finding common neighborhoods in massive graphs. Theor. Comput. Sci., 299(1-3):707-718, 2003. Google Scholar
  11. Michael Elkin. Streaming and fully dynamic centralized algorithms for constructing and maintaining sparse spanners. ACM Trans. Algorithms, 7(2):20:1-20:17, 2011. Google Scholar
  12. Michael Elkin. Distributed exact shortest paths in sublinear time. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 757-770, 2017. Google Scholar
  13. Shimon Even and Robert Endre Tarjan. Network Flow and Testing Graph Connectivity. SIAM J. Comput., 4(4):507-518, 1975. Google Scholar
  14. Martin Farach-Colton, Tsan-sheng Hsu, Meng Li, and Meng-Tsung Tsai. Finding Articulation Points of Large Graphs in Linear Time. In Algorithms and Data Structures, WADS, pages 363-372, Cham, 2015. Springer International Publishing. Google Scholar
  15. Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, and Jian Zhang. Graph Distances in the Streaming Model: The Value of Space. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '05, pages 745-754, 2005. Google Scholar
  16. Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, and Jian Zhang. On Graph Problems in a Semi-streaming Model. Theor. Comput. Sci., 348(2):207-216, December 2005. Google Scholar
  17. Joan Feigenbaum, Sampath Kannan, Martin J. Strauss, and Mahesh Viswanathan. An Approximate L1-Difference Algorithm for Massive Data Streams. SIAM J. Comput., 32(1):131-151, January 2003. Google Scholar
  18. Philippe Flajolet and G. Nigel Martin. Probabilistic counting algorithms for data base applications. Journal of Computer and System Sciences, 31(2):182-209, 1985. Google Scholar
  19. Sudipto Guha, Nick Koudas, and Kyuseok Shim. Data-streams and Histograms. In Proceedings of the Thirty-third Annual ACM Symposium on Theory of Computing, STOC '01, pages 471-475, 2001. Google Scholar
  20. Venkatesan Guruswami and Krzysztof Onak. Superlinear Lower Bounds for Multipass Graph Processing. Algorithmica, 76(3):654-683, November 2016. Google Scholar
  21. Venkatesan Guruswami and Krzysztof Onak. Superlinear Lower Bounds for Multipass Graph Processing. Algorithmica, 76(3):654-683, 2016. Google Scholar
  22. Monika Rauch Henzinger, Prabhakar Raghavan, and Sridhar Rajagopalan. Computing on data streams. In External Memory Algorithms, Proceedings of a DIMACS Workshop, New Brunswick, New Jersey, USA, May 20-22, 1998, pages 107-118, 1998. Google Scholar
  23. John E. Hopcroft and Richard M. Karp. An n^5/2 Algorithm for Maximum Matchings in Bipartite Graphs. SIAM J. Comput., 2(4):225-231, 1973. Google Scholar
  24. John E. Hopcroft and Robert Endre Tarjan. Efficient Planarity Testing. J. ACM, 21(4):549-568, 1974. Google Scholar
  25. Piotr Indyk. Stable Distributions, Pseudorandom Generators, Embeddings, and Data Stream Computation. J. ACM, 53(3):307-323, May 2006. Google Scholar
  26. Sagar Kale and Sumedh Tirodkar. Maximum Matching in Two, Three, and a Few More Passes Over Graph Streams. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2017, August 16-18, 2017, Berkeley, CA, USA, pages 15:1-15:21, 2017. Google Scholar
  27. Michael Kapralov. Better bounds for matchings in the streaming model. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2013, New Orleans, Louisiana, USA, January 6-8, 2013, pages 1679-1697, 2013. Google Scholar
  28. Shahbaz Khan. Near Optimal Parallel Algorithms for Dynamic DFS in Undirected Graphs. In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2017, Washington DC, USA, July 24-26, 2017, pages 283-292, 2017. Google Scholar
  29. Lasse Kliemann. Engineering a Bipartite Matching Algorithm in the Semi-Streaming Model. In Algorithm Engineering - Selected Results and Surveys, pages 352-378. Springer International Publishing, 2016. Google Scholar
  30. Andrew McGregor. Finding Graph Matchings in Data Streams. In Approximation, Randomization and Combinatorial Optimization, Algorithms and Techniques, 8th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2005 and 9th InternationalWorkshop on Randomization and Computation, RANDOM 2005, Berkeley, CA, USA, August 22-24, 2005, Proceedings, pages 170-181, 2005. Google Scholar
  31. Andrew McGregor. Graph Stream Algorithms: A Survey. SIGMOD Rec., 43(1):9-20, May 2014. Google Scholar
  32. Shan Muthukrishnan. Data Streams: Algorithms and Applications. Foundations and Trends in Theoretical Computer Science, 1(2):117-236, 2005. Google Scholar
  33. Thomas C. O'Connell. A Survey of Graph Algorithms Under Extended Streaming Models of Computation. In Fundamental Problems in Computing: Essays in Honor of Professor Daniel J. Rosenkrantz, pages 455-476, 2009. Google Scholar
  34. Jan Matthias Ruhl. Efficient Algorithms for New Computational Models. PhD Thesis, Department of Computer Science, MIT, Cambridge, MA, 2003. Google Scholar
  35. Robert Endre Tarjan. Depth-First Search and Linear Graph Algorithms. SIAM J. Comput., 1(2):146-160, 1972. Google Scholar
  36. Robert Endre Tarjan. Finding Dominators in Directed Graphs. SIAM J. Comput., 3(1):62-89, 1974. Google Scholar
  37. Robert Endre Tarjan. Efficiency of a Good But Not Linear Set Union Algorithm. J. ACM, 22(2):215-225, April 1975. Google Scholar
  38. Robert Endre Tarjan. Edge-Disjoint Spanning Trees and Depth-First Search. Acta Inf., 6:171-185, 1976. Google Scholar
  39. Robert Endre Tarjan and Jan van Leeuwen. Worst-case Analysis of Set Union Algorithms. J. ACM, 31(2):245-281, 1984. Google Scholar
  40. Jeffery Westbrook and Robert Endre Tarjan. Maintaining Bridge-Connected and Biconnected Components On-Line. Algorithmica, 7(5&6):433-464, 1992. Google Scholar
  41. Jian Zhang. A Survey on Streaming Algorithms for Massive Graphs. In Managing and Mining Graph Data, pages 393-420. Springer US, 2010. Google Scholar
Questions / Remarks / Feedback

Feedback for Dagstuhl Publishing

Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail