Streaming Algorithms for Geometric Steiner Forest

Authors Artur Czumaj, Shaofeng H.-C. Jiang , Robert Krauthgamer, Pavel Veselý



PDF
Thumbnail PDF

File

LIPIcs.ICALP.2022.47.pdf
  • Filesize: 0.89 MB
  • 20 pages

Document Identifiers

Author Details

Artur Czumaj
  • University of Warwick, Coventry, UK
Shaofeng H.-C. Jiang
  • Peking University, Beijing, China
Robert Krauthgamer
  • Weizmann Institute of Science, Rehovot, Israel
Pavel Veselý
  • Charles University, Prague, Czech Republic

Cite AsGet BibTex

Artur Czumaj, Shaofeng H.-C. Jiang, Robert Krauthgamer, and Pavel Veselý. Streaming Algorithms for Geometric Steiner Forest. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 47:1-47:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.ICALP.2022.47

Abstract

We consider an important generalization of the Steiner tree problem, the Steiner forest problem, in the Euclidean plane: the input is a multiset X ⊆ ℝ², partitioned into k color classes C₁, C₂, …, Cₖ ⊆ X. The goal is to find a minimum-cost Euclidean graph G such that every color class Cᵢ is connected in G. We study this Steiner forest problem in the streaming setting, where the stream consists of insertions and deletions of points to X. Each input point x ∈ X arrives with its color color(x) ∈ [k], and as usual for dynamic geometric streams, the input is restricted to the discrete grid {0, …, Δ}². We design a single-pass streaming algorithm that uses poly(k ⋅ log Δ) space and time, and estimates the cost of an optimal Steiner forest solution within ratio arbitrarily close to the famous Euclidean Steiner ratio α₂ (currently 1.1547 ≤ α₂ ≤ 1.214). This approximation guarantee matches the state of the art bound for streaming Steiner tree, i.e., when k = 1. Our approach relies on a novel combination of streaming techniques, like sampling and linear sketching, with the classical Arora-style dynamic-programming framework for geometric optimization problems, which usually requires large memory and has so far not been applied in the streaming setting. We complement our streaming algorithm for the Steiner forest problem with simple arguments showing that any finite approximation requires Ω(k) bits of space.

Subject Classification

ACM Subject Classification
  • Theory of computation → Sketching and sampling
  • Theory of computation → Streaming models
  • Theory of computation → Computational geometry
Keywords
  • Steiner forest
  • streaming
  • sublinear algorithms
  • dynamic programming

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Ajit Agrawal, Philip N. Klein, and R. Ravi. When trees collide: An approximation algorithm for the generalized Steiner problem on networks. SIAM Journal on Computing, 24(3):440-456, 1995. Google Scholar
  2. Mattias Andersson, Joachim Gudmundsson, Christos Levcopoulos, and Giri Narasimhan. Balanced partition of minimum spanning trees. International Journal of Computational Geometry and Applications, 13(4):303-316, 2003. Google Scholar
  3. Alexandr Andoni, Khanh Do Ba, Piotr Indyk, and David P. Woodruff. Efficient sketches for earth-mover distance, with applications. In Proceedings of the 50th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 324-330, 2009. Google Scholar
  4. Alexandr Andoni, Piotr Indyk, and Robert Krauthgamer. Earth mover distance over high-dimensional spaces. In Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 343-352, 2008. Google Scholar
  5. Alexandr Andoni and Huy L. Nguyen. Width of points in the streaming model. In Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 447-452, 2012. Google Scholar
  6. Alexandr Andoni, Aleksandar Nikolov, Krzysztof Onak, and Grigory Yaroslavtsev. Parallel algorithms for geometric graph problems. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing (STOC), pages 574-583, 2014. Google Scholar
  7. Sanjeev Arora. Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems. Journal of the ACM, 45(5):753-782, 1998. Google Scholar
  8. Sanjeev Arora, Prabhakar Raghavan, and Satish Rao. Approximation schemes for Euclidean k-medians and related problems. In Proceedings of the 13th Annual ACM Symposium on the Theory of Computing (STOC), pages 106-113, 1998. Google Scholar
  9. Yair Bartal. Probabilistic approximations of metric spaces and its algorithmic applications. In Proceedings of the 37th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 184-193, 1996. Google Scholar
  10. MohammadHossein Bateni, Hossein Esfandiari, and Vahab S. Mirrokni. Almost optimal streaming algorithms for coverage problems. In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 13-23, 2017. Google Scholar
  11. MohammadHossein Bateni and MohammadTaghi Hajiaghayi. Euclidean prize-collecting Steiner forest. Algorithmica, 62(3-4):906-929, 2012. Google Scholar
  12. MohammadHossein Bateni, MohammadTaghi Hajiaghayi, and Dániel Marx. Approximation schemes for Steiner forest on planar graphs and graphs of bounded treewidth. Journal of the ACM, 58(5):21:1-21:37, 2011. Google Scholar
  13. Djamal Belazzougui and Qin Zhang. Edit distance: Sketching, streaming, and document exchange. In Proceedings of the 57th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 51-60, 2016. Google Scholar
  14. Glencora Borradaile, Philip N. Klein, and Claire Mathieu. A polynomial-time approximation scheme for Euclidean Steiner forest. ACM Transactions of Algorithms, 11(3):19:1-19:20, 2015. Google Scholar
  15. Vladimir Braverman, Gereon Frahling, Harry Lang, Christian Sohler, and Lin F. Yang. Clustering high dimensional dynamic data streams. In Proceedings of the 34th International Conference on Machine Learning (ICML), pages 576-585, 2017. Google Scholar
  16. Diptarka Chakraborty, Elazar Goldenberg, and Michal Koucký. Streaming algorithms for embedding and computing edit distance in the low distance regime. In Proceedings of the 48th Annual ACM Symposium on Theory of Computing (STOC), pages 712-725, 2016. Google Scholar
  17. T.-H. Hubert Chan, Shuguang Hu, and Shaofeng H.-C. Jiang. A PTAS for the Steiner forest problem in doubling metrics. SIAM Journal on Computing, 47(4):1705-1734, 2018. Google Scholar
  18. Timothy M. Chan. Faster core-set constructions and data-stream algorithms in fixed dimensions. Computation Geometry, 35(1-2):20-35, 2006. Google Scholar
  19. Timothy M. Chan. Dynamic streaming algorithms for ε-kernels. In Proceedings of the 32nd International Symposium on Computational Geometry (SoCG), pages 27:1-27:11, 2016. Google Scholar
  20. Moses Charikar. Similarity estimation techniques from rounding algorithms. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC), pages 380-388, 2002. Google Scholar
  21. Bernard Chazelle, Ronitt Rubinfeld, and Luca Trevisan. Approximating the minimum spanning tree weight in sublinear time. SIAM Journal on Computing, 34(6):1370-1379, 2005. Google Scholar
  22. Kuan Cheng, Alireza Farhadi, MohammadTaghi Hajiaghayi, Zhengzhong Jin, Xin Li, Aviad Rubinstein, Saeed Seddighin, and Yu Zheng. Streaming and small space approximation algorithms for edit distance and longest common subsequence. In Proceedings of the 48th International Colloquium on Automata, Languages, and Programming (ICALP), pages 54:1-54:20, 2021. Google Scholar
  23. F. R. K. Chung and R. L. Graham. A new bound for Euclidean Steiner minimal trees. Annals of the New York Academy of Sciences, 440(1):328-346, 1985. Google Scholar
  24. Artur Czumaj, Christiane Lammersen, Morteza Monemizadeh, and Christian Sohler. (1+ε)-approximation for facility location in data streams. In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1710-1728, 2013. Google Scholar
  25. Funda Ergün and Hossein Jowhari. On the monotonicity of a data stream. Combinatorica, 35(6):641-653, 2015. Google Scholar
  26. Joan Feigenbaum, Sampath Kannan, and Jian Zhang. Computing diameter in the streaming and sliding-window models. Algorithmica, 41(1):25-41, 2005. Google Scholar
  27. Gereon Frahling, Piotr Indyk, and Christian Sohler. Sampling in dynamic data streams and applications. International Journal of Computational Geometry and Applications, 18(1/2):3-28, 2008. Google Scholar
  28. Gereon Frahling and Christian Sohler. Coresets in dynamic geometric data streams. In Proceedings of the 37th Annual ACM Symposium on Theory of Computing (STOC), pages 209-217, 2005. Google Scholar
  29. Edgar N. Gilbert and Henry O. Pollak. Steiner minimal trees. SIAM Journal on Applied Mathematics, 16(1):1-29, 1968. Google Scholar
  30. Michel X. Goemans and David P. Williamson. A general approximation technique for constrained forest problems. SIAM Journal on Computing, 24(2):296-317, 1995. Google Scholar
  31. Parikshit Gopalan, T. S. Jayram, Robert Krauthgamer, and Ravi Kumar. Estimating the sortedness of a data stream. In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 318-327, 2007. Google Scholar
  32. Martin Groß, Anupam Gupta, Amit Kumar, Jannik Matuschke, Daniel R. Schmidt, Melanie Schmidt, and José Verschae. A local-search algorithm for Steiner forest. In Proceedings of the 9th Innovations in Theoretical Computer Science Conference (ITCS 2018), pages 31:1-31:17, 2018. Google Scholar
  33. Anupam Gupta and Amit Kumar. Greedy algorithms for Steiner forest. In Proceedings of the 47th Annual ACM Symposium on Theory of Computing (STOC), pages 871-878, 2015. Google Scholar
  34. Sariel Har-Peled and Soham Mazumdar. On coresets for k-means and k-median clustering. In Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC), pages 291-300, 2004. Google Scholar
  35. Wei Hu, Zhao Song, Lin F. Yang, and Peilin Zhong. Nearly optimal dynamic k-means clustering for high-dimensional data, 2019. URL: http://arxiv.org/abs/1802.00459.
  36. Piotr Indyk. Algorithms for dynamic geometric problems over data streams. In Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC), pages 373-380, 2004. Google Scholar
  37. Piotr Indyk and Nitin Thaper. Fast image retrieval via embeddings. In Proceedings of the 3rd International Workshop on Statistical and Computational Theories of Vision (SCTV), 2003. URL: https://people.csail.mit.edu/indyk/emd.pdf.
  38. Kamal Jain. A factor 2 approximation algorithm for the generalized Steiner network problem. Combinatorica, 21(1):39-60, 2001. Google Scholar
  39. T. S. Jayram, Ravi Kumar, and D. Sivakumar. The one-way communication complexity of Hamming distance. Theory of Computing, 4(6):129-135, 2008. Google Scholar
  40. Daniel M. Kane, Jelani Nelson, Ely Porat, and David P. Woodruff. Fast moment estimation in data streams in optimal space. In Proceedings of the 43rd Annual ACM Symposium on Theory of Computing (STOC), pages 745-754, 2011. Google Scholar
  41. Ilan Kremer, Noam Nisan, and Dana Ron. On randomized one-round communication complexity. Computational Complexity, 8(1):21-49, 1999. Google Scholar
  42. Eyal Kushilevitz and Noam Nisan. Communication Complexity. Cambridge University Press, 1997. Google Scholar
  43. Christiane Lammersen and Christian Sohler. Facility location in dynamic geometric data streams. In Proceedings of the 16th Annual European Symposium on Algorithms (ESA), pages 660-671, 2008. Google Scholar
  44. Kasper Green Larsen, Jelani Nelson, Huy L. Nguyen, and Mikkel Thorup. Heavy hitters via cluster-preserving clustering. Communications of the ACM, 62(8):95-100, 2019. Google Scholar
  45. Thomas L. Magnanti and Laurence A. Wolsey. Chapter 9: Optimal trees. In Network Models, volume 7 of Handbooks in Operations Research and Management Science, pages 503-615. Elsevier, 1995. Google Scholar
  46. Joseph S. B. Mitchell. Guillotine subdivisions approximate polygonal subdivisions: A simple polynomial-time approximation scheme for geometric TSP, k-MST, and related problems. SIAM Journal on Computing, 28(4):1298-1309, 1999. Google Scholar
  47. David Pollard. Empirical Processes: Theory and Applications, chapter 4: Packing and Covering in Euclidean Spaces, pages 14-20. IMS, 1990. Google Scholar
  48. Michael E. Saks and C. Seshadhri. Space efficient streaming algorithms for the distance to monotonicity and asymmetric edit distance. In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1698-1709, 2013. Google Scholar
  49. Guido Schäfer. Steiner Forest. In Ming-Yang Kao, editor, Encyclopedia of Algorithms, pages 2099-2102. Springer, New York, NY, 2016. Google Scholar
  50. Christian Sohler. Problem 52: TSP in the streaming model. https://sublinear.info/52, 2012.
  51. Xiaoming Sun and David P. Woodruff. The communication and streaming complexity of computing the longest common and increasing subsequences. In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 336-345, 2007. Google Scholar
  52. Kiri Wagstaff, Claire Cardie, Seth Rogers, and Stefan Schrödl. Constrained k-means clustering with background knowledge. In Proceedings of the 18th International Conference on Machine Learning (ICML), pages 577-584, 2001. Google Scholar
  53. Liang Zhao, Hiroshi Nagamochi, and Toshihide Ibaraki. Greedy splitting algorithms for approximating multiway partition problems. Mathematical Programming, Series A, 102(1):167-183, 2005. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail