On the Complexity of Computing Time Series Medians Under the Move-Split-Merge Metric

Authors Jana Holznigenkemper, Christian Komusiewicz, Nils Morawietz, Bernhard Seeger




File

LIPIcs.MFCS.2023.54.pdf
  • Filesize: 0.85 MB
  • 15 pages

Author Details

Jana Holznigenkemper
  • Fachbereich Mathematik und Informatik, Philipps-Universität Marburg, Germany
Christian Komusiewicz
  • Institute of Computer Science, Friedrich-Schiller-Universität Jena, Germany
Nils Morawietz
  • Fachbereich Mathematik und Informatik, Philipps-Universität Marburg, Germany
Bernhard Seeger
  • Fachbereich Mathematik und Informatik, Philipps-Universität Marburg, Germany

Cite As

Jana Holznigenkemper, Christian Komusiewicz, Nils Morawietz, and Bernhard Seeger. On the Complexity of Computing Time Series Medians Under the Move-Split-Merge Metric. In 48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 272, pp. 54:1-54:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.MFCS.2023.54

Abstract

We initiate a study of the complexity of MSM-Median, the problem of computing a median of a set of k real-valued time series under the move-split-merge (MSM) distance. This distance measure is based on three operations: moves, which change the value of a data point in a time series; splits, which replace one data point in a time series by two consecutive data points of the same value; and merges, which replace two consecutive data points of equal value by a single data point of the same value. The cost of a move operation is the absolute difference between the data point's value before and after the operation; the cost of each split or merge operation is a given constant c. Our main results are as follows. First, we show that MSM-Median is NP-hard and W[1]-hard with respect to k, even for time series with at most three distinct values. Under the Exponential Time Hypothesis (ETH), our reduction implies that a previous dynamic programming algorithm with running time |I|^𝒪(k) [Holznigenkemper et al., Data Min. Knowl. Discov. '23] is essentially optimal. Here, |I| denotes the total input size. Second, we show that MSM-Median can be solved in 2^𝒪(d/c)⋅|I|^𝒪(1) time, where d is the total distance of the median to the input time series.
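To make the distance and the median objective concrete, the following is a minimal Python sketch of the pairwise MSM distance, using the quadratic-time dynamic program commonly attributed to Stefan et al. [12], together with the sum of distances that a median minimizes. It is not taken from this paper: the function names `msm_distance`, `total_distance`, and `_split_merge_cost` are illustrative assumptions, and the sketch only evaluates a given candidate rather than searching for a median.

```python
from typing import Sequence


def _split_merge_cost(new: float, a: float, b: float, c: float) -> float:
    """Cost of a split/merge involving the point `new` with neighbours `a`, `b`.

    If `new` lies between `a` and `b`, only the constant c is paid; otherwise
    `new` must additionally be moved to the closer of the two values.
    """
    if a <= new <= b or b <= new <= a:
        return c
    return c + min(abs(new - a), abs(new - b))


def msm_distance(x: Sequence[float], y: Sequence[float], c: float) -> float:
    """MSM distance between non-empty series x and y (O(len(x) * len(y)) time)."""
    m, n = len(x), len(y)
    if m == 0 or n == 0:
        raise ValueError("both time series must be non-empty")

    # D[i][j] = cost of transforming x[0..i] into y[0..j]
    D = [[0.0] * n for _ in range(m)]
    D[0][0] = abs(x[0] - y[0])  # a single move
    for i in range(1, m):
        D[i][0] = D[i - 1][0] + _split_merge_cost(x[i], x[i - 1], y[0], c)
    for j in range(1, n):
        D[0][j] = D[0][j - 1] + _split_merge_cost(y[j], x[0], y[j - 1], c)

    for i in range(1, m):
        for j in range(1, n):
            D[i][j] = min(
                D[i - 1][j - 1] + abs(x[i] - y[j]),                          # move
                D[i - 1][j] + _split_merge_cost(x[i], x[i - 1], y[j], c),    # merge
                D[i][j - 1] + _split_merge_cost(y[j], x[i], y[j - 1], c),    # split
            )
    return D[m - 1][n - 1]


def total_distance(candidate: Sequence[float],
                   series: Sequence[Sequence[float]],
                   c: float) -> float:
    """Objective of MSM-Median: sum of MSM distances from candidate to all inputs."""
    return sum(msm_distance(candidate, s, c) for s in series)
```

For example, with c = 0.5, `msm_distance([1, 2], [1], 0.5)` evaluates to 1.5: the point 2 is moved to 1 at cost 1, and the two equal points are then merged at cost 0.5. The quantity returned by `total_distance` for an optimal candidate corresponds to the parameter d in the second result above.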

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Time series analysis
Keywords
  • Parameterized Complexity
  • Median String
  • Time Series
  • ETH


References

  1. Saeed Reza Aghabozorgi, Ali Seyed Shirkhorshidi, and Ying Wah Teh. Time-series clustering - A decade review. Inf. Syst., 53:16-38, 2015. URL: https://doi.org/10.1016/j.is.2015.04.007.
  2. Markus Brill, Till Fluschnik, Vincent Froese, Brijnesh J. Jain, Rolf Niedermeier, and David Schultz. Exact mean computation in dynamic time warping spaces. Data Min. Knowl. Discov., 33(1):252-291, 2019. URL: https://doi.org/10.1007/s10618-018-0604-8.
  3. Karl Bringmann and Marvin Künnemann. Quadratic conditional lower bounds for string problems and dynamic time warping. In Proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS '15), pages 79-97. IEEE Computer Society, 2015.
  4. Laurent Bulteau, Vincent Froese, and Rolf Niedermeier. Tight hardness results for consensus problems on circular strings and time series. SIAM J. Discret. Math., 34(3):1854-1883, 2020. URL: https://doi.org/10.1137/19M1255781.
  5. Jana Holznigenkemper, Christian Komusiewicz, and Bernhard Seeger. Exact and heuristic approaches to speeding up the MSM time series distance computation. In Proceedings of the 2023 SIAM International Conference on Data Mining (SDM '23), pages 451-459. SIAM, 2023. URL: https://doi.org/10.1137/1.9781611977653.ch51.
  6. Jana Holznigenkemper, Christian Komusiewicz, and Bernhard Seeger. On computing exact means of time series using the move-split-merge metric. Data Min. Knowl. Discov., 37(2):595-626, 2023.
  7. Russell Impagliazzo, Ramamohan Paturi, and Francis Zane. Which problems have strongly exponential complexity? J. Comput. Syst. Sci., 63(4):512-530, 2001. URL: https://doi.org/10.1006/jcss.2001.1774.
  8. Weiwei Jiang. Time series classification: Nearest neighbor versus deep learning models. SN Appl. Sci., 2(4):721, 2020. URL: https://doi.org/10.1007/s42452-020-2506-9.
  9. Jason Lines and Anthony Bagnall. Time series classification with ensembles of elastic distance measures. Data Min. Knowl. Discov., 29:565-592, 2015. URL: https://doi.org/10.1007/s10618-014-0361-2.
  10. John Paparrizos and Luis Gravano. Fast and accurate time-series clustering. ACM Trans. Database Syst., 42(2):8:1-8:49, 2017. URL: https://doi.org/10.1145/3044711.
  11. John Paparrizos, Chunwei Liu, Aaron J. Elmore, and Michael J. Franklin. Debunking four long-standing misconceptions of time-series distance measures. In Proceedings of the 2020 International Conference on Management of Data (SIGMOD '20), pages 1887-1905. ACM, 2020. URL: https://doi.org/10.1145/3318464.3389760.
  12. Alexandra Stefan, Vassilis Athitsos, and Gautam Das. The move-split-merge metric for time series. IEEE Trans. Knowl. Data Eng., 25(6):1425-1438, 2013. URL: https://doi.org/10.1109/TKDE.2012.88.