Streaming Algorithms for Geometric Steiner Forest (2011.04324v4)

Published 9 Nov 2020 in cs.DS

Abstract: We consider an important generalization of the Steiner tree problem, the \emph{Steiner forest problem}, in the Euclidean plane: the input is a multiset $X \subseteq \mathbb{R}^2$, partitioned into $k$ color classes $C_1, C_2, \ldots, C_k \subseteq X$. The goal is to find a minimum-cost Euclidean graph $G$ such that every color class $C_i$ is connected in $G$. We study this Steiner forest problem in the streaming setting, where the stream consists of insertions and deletions of points to $X$. Each input point $x\in X$ arrives with its color $\textsf{color}(x) \in [k]$, and as usual for dynamic geometric streams, the input points are restricted to the discrete grid $\{0, \ldots, \Delta\}^2$. We design a single-pass streaming algorithm that uses $\mathrm{poly}(k \cdot \log\Delta)$ space and time, and estimates the cost of an optimal Steiner forest solution within a ratio arbitrarily close to the famous Euclidean Steiner ratio $\alpha_2$ (currently $1.1547 \le \alpha_2 \le 1.214$). This approximation guarantee matches the state-of-the-art bound for streaming Steiner tree, i.e., when $k=1$, and it is a major open question to improve the ratio to $1 + \epsilon$ even for this special case. Our approach relies on a novel combination of streaming techniques, like sampling and linear sketching, with the classical Arora-style dynamic-programming framework for geometric optimization problems, which usually requires large memory and has so far not been applied in the streaming setting. We complement our streaming algorithm for the Steiner forest problem with simple arguments showing that any finite approximation requires $\Omega(k)$ bits of space.
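
To make the problem statement concrete, here is a minimal offline sketch in Python. It is not the paper's streaming algorithm: it stores every point and enumerates all partitions of the colors, so it is only practical for tiny inputs. It replays a stream of insertions and deletions of colored grid points, then returns the cheapest solution that uses no Steiner points, i.e. the minimum over all ways of grouping the color classes of the total minimum-spanning-tree cost of the groups. Each component of an optimal Steiner forest is a tree connecting the union of some color classes, so by the definition of the Steiner ratio this value lies between the optimum and $\alpha_2$ times the optimum. All function names and the update format `(op, point, color)` are assumptions made for this illustration.

```python
import math
from collections import Counter


def apply_stream(updates):
    """Replay (op, point, color) updates, op in {"+", "-"}; return surviving points per color."""
    counts = {}
    for op, point, color in updates:
        counts.setdefault(color, Counter())[point] += 1 if op == "+" else -1
    return {c: [p for p, m in cnt.items() if m > 0] for c, cnt in counts.items()}


def mst_cost(points):
    """Euclidean MST cost via Prim's algorithm in O(n^2) time."""
    pts = list(points)
    if len(pts) <= 1:
        return 0.0
    # best[i] = distance from point i to the current tree (initially just pts[0])
    best = {i: math.dist(pts[0], pts[i]) for i in range(1, len(pts))}
    total = 0.0
    while best:
        j = min(best, key=best.get)
        total += best.pop(j)
        for i in best:
            best[i] = min(best[i], math.dist(pts[j], pts[i]))
    return total


def set_partitions(items):
    """Yield every partition of a list into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part


def spanning_forest_cost(classes):
    """Cheapest Steiner-point-free solution: min over color groupings of summed MST costs."""
    colors = [c for c in classes if classes[c]]
    best = math.inf
    for part in set_partitions(colors):
        # Duplicate points cost nothing to connect, so a set is enough here.
        cost = sum(mst_cost({p for c in group for p in classes[c]}) for group in part)
        best = min(best, cost)
    return best


updates = [
    ("+", (0, 0), 0), ("+", (4, 0), 0),
    ("+", (0, 1), 1), ("+", (4, 1), 1),
    ("+", (9, 9), 0), ("-", (9, 9), 0),  # an insertion that is later deleted
]
print(spanning_forest_cost(apply_stream(updates)))  # prints 6.0
```

In this example, connecting the two colors separately costs 4 + 4 = 8, while a single tree spanning all four points costs 1 + 1 + 4 = 6, which shows why a Steiner forest solution may merge color classes.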
