Random Projections for Curves in High Dimensions

Authors Ioannis Psarros, Dennis Rohde



File

LIPIcs.SoCG.2023.53.pdf
  • Filesize: 0.69 MB
  • 15 pages

Author Details

Ioannis Psarros
  • Athena Research Center, Marousi, Greece
Dennis Rohde
  • Universität Bonn, Germany

Cite As

Ioannis Psarros and Dennis Rohde. Random Projections for Curves in High Dimensions. In 39th International Symposium on Computational Geometry (SoCG 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 258, pp. 53:1-53:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.SoCG.2023.53

Abstract

Modern time series analysis requires the ability to handle datasets that are inherently high-dimensional; examples include applications in climatology, where measurements from numerous sensors must be taken into account, or inventory tracking of large shops, where the dimension is defined by the number of tracked items. The standard way to mitigate computational issues arising from the high dimensionality of the data is to apply a dimension reduction technique that preserves the structural properties of the ambient space. The dissimilarity between two time series is often measured by "discrete" notions of distance, e.g., dynamic time warping or the discrete Fréchet distance. Since all these distance functions are computed directly on the points of a time series, they are sensitive to different sampling rates or gaps. The continuous Fréchet distance offers a popular alternative which aims to alleviate this by taking into account all points on the polygonal curve obtained by linearly interpolating between any two consecutive points in a sequence. We study the ability of random projections à la Johnson and Lindenstrauss to preserve the continuous Fréchet distance of polygonal curves by effectively reducing the dimension. In particular, we show that one can reduce the dimension to O(ε^{-2} log N), where N is the total number of input points, while preserving the continuous Fréchet distance between any two determined polygonal curves within a factor of 1 ± ε. We conclude with applications to clustering.
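
The following Python sketch illustrates the kind of projection the abstract refers to; it is not the authors' implementation. It assumes a dense Gaussian matrix as the Johnson-Lindenstrauss map and treats the target dimension `k` as a free parameter, since the abstract gives only the asymptotic bound O(ε^{-2} log N) and no constant. Because the map is linear, projecting the vertices of a polygonal curve also projects every interpolated point on its edges. For brevity the sketch compares curves with the discrete Fréchet distance, a simpler stand-in for the continuous Fréchet distance studied in the paper. The function names `gaussian_matrix`, `project_curve` and `discrete_frechet` are hypothetical.

```python
import math
import random


def gaussian_matrix(d, k, seed=0):
    """Return a k x d matrix with i.i.d. N(0, 1/k) entries (assumed JL map)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)  # squared norms are preserved in expectation
    return [[rng.gauss(0.0, scale) for _ in range(d)] for _ in range(k)]


def project_curve(curve, matrix):
    """Apply the linear map to every vertex of a polygonal curve."""
    return [
        [sum(r * x for r, x in zip(row, point)) for row in matrix]
        for point in curve
    ]


def discrete_frechet(p, q):
    """Discrete Fréchet distance via the standard dynamic program.

    This is a stand-in: the paper's guarantee concerns the continuous
    Fréchet distance, which needs a more involved algorithm.
    """
    n, m = len(p), len(q)
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], d)
            else:
                best_prev = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
                dp[i][j] = max(best_prev, d)
    return dp[-1][-1]
```

To try it, build one matrix with `gaussian_matrix(d, k)`, apply it to all curves with `project_curve`, and compare `discrete_frechet` before and after projection. Sharing a single matrix across all curves matters, because the guarantee is stated for distances between curves under the same map.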

Subject Classification

ACM Subject Classification
  • Theory of computation → Computational geometry
  • Theory of computation → Random projections and metric embeddings
Keywords
  • polygonal curves
  • time series
  • dimension reduction
  • Johnson-Lindenstrauss lemma
  • Fréchet distance

