Efficient Algorithms for Least Square Piecewise Polynomial Regression

Authors: Daniel Lokshtanov, Subhash Suri, Jie Xue



File

LIPIcs.ESA.2021.63.pdf
  • Filesize: 0.75 MB
  • 15 pages

Author Details

Daniel Lokshtanov
  • Department of Computer Science, University of California, Santa Barbara, CA, USA
Subhash Suri
  • Department of Computer Science, University of California, Santa Barbara, CA, USA
Jie Xue
  • Department of Computer Science, University of California, Santa Barbara, CA, USA

Cite As

Daniel Lokshtanov, Subhash Suri, and Jie Xue. Efficient Algorithms for Least Square Piecewise Polynomial Regression. In 29th Annual European Symposium on Algorithms (ESA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 204, pp. 63:1-63:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.ESA.2021.63

Abstract

We present approximation and exact algorithms for piecewise regression of univariate and bivariate data using fixed-degree polynomials. Specifically, given a set S of n data points (𝐱₁, y₁), …, (𝐱_n, y_n) ∈ ℝ^d × ℝ where d ∈ {1,2}, the goal is to segment the 𝐱_i’s into some (arbitrary) number of disjoint pieces P₁, …, P_k, where each piece P_j is associated with a fixed-degree polynomial f_j: ℝ^d → ℝ, to minimize the total loss function λk + ∑_{i=1}^n (y_i - f(𝐱_i))², where λ ≥ 0 is a regularization term that penalizes model complexity (number of pieces) and f: ⨆_{j=1}^k P_j → ℝ is the piecewise polynomial function defined as f|_{P_j} = f_j. The pieces P₁, …, P_k are disjoint intervals of ℝ in the case of univariate data and disjoint axis-aligned rectangles in the case of bivariate data. Our error approximation allows use of any fixed-degree polynomial, not just linear functions. Our main results are the following. For univariate data, we present a (1 + ε)-approximation algorithm with time complexity O((n/ε) log(1/ε)), assuming that data is presented in sorted order of the x_i’s. For bivariate data, we present three results: a sub-exponential exact algorithm with running time n^{O(√n)}; a polynomial-time constant-approximation algorithm; and a quasi-polynomial time approximation scheme (QPTAS). The bivariate case is believed to be NP-hard in the folklore but we could not find a published record in the literature, so in this paper we also present a hardness proof for completeness.
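
To make the objective in the abstract concrete, the following is a minimal sketch of the loss λk + ∑ (y_i - f(x_i))² in the simplest setting: univariate data with degree-0 (constant) pieces. It uses the classic quadratic-time dynamic program for segmented least squares, which is an exact baseline and not the near-linear (1 + ε)-approximation algorithm of the paper. The function name `segment_constant` and its return format are hypothetical, chosen only for illustration, and the x-values are assumed to be distinct.

```python
from itertools import accumulate


def segment_constant(xs, ys, lam):
    """Exact O(n^2) DP for the piecewise-constant (degree-0) univariate case.

    Illustrative baseline only; NOT the paper's approximation algorithm.
    Assumes distinct x-values. Minimizes lam * k + (sum of squared errors)
    over all ways to split the x-sorted points into k contiguous pieces.
    Returns (optimal loss, pieces), where each piece is a half-open index
    range (start, end) into the x-sorted order.
    """
    n = len(ys)
    if n == 0:
        return 0.0, []
    # Sort by x so that contiguous index ranges correspond to intervals of R.
    order = sorted(range(n), key=lambda i: xs[i])
    y = [ys[i] for i in order]
    # Prefix sums of y and y^2 give the squared error of any piece in O(1).
    s1 = [0.0] + list(accumulate(y))
    s2 = [0.0] + list(accumulate(v * v for v in y))

    def sse(i, j):
        # Squared error of the best constant (the mean) on y[i:j].
        total = s1[j] - s1[i]
        return max(0.0, (s2[j] - s2[i]) - total * total / (j - i))

    best = [0.0] + [float("inf")] * n  # best[j]: optimal loss on y[:j]
    prev = [0] * (n + 1)               # prev[j]: start of the last piece
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + lam + sse(i, j)  # lam pays for one more piece
            if cost < best[j]:
                best[j], prev[j] = cost, i
    # Walk the back-pointers to recover the pieces.
    pieces = []
    j = n
    while j > 0:
        pieces.append((prev[j], j))
        j = prev[j]
    pieces.reverse()
    return best[n], pieces


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4, 5]
    ys = [1, 1, 1, 5, 5, 5]
    print(segment_constant(xs, ys, lam=1.0))
    print(segment_constant(xs, ys, lam=100.0))
```

A small λ favors many pieces with low squared error, while a large λ makes a single piece cheaper overall; the two calls at the bottom show this trade-off on a step-shaped data set. Extending the sketch to degree-t polynomials would mean replacing `sse` with a least-squares polynomial fit on each candidate piece.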

Subject Classification

ACM Subject Classification
  • Theory of computation → Computational geometry
Keywords
  • regression analysis
  • piecewise polynomial
  • least square error
