Efficient Algorithms for Least Square Piecewise Polynomial Regression

Lokshtanov, Daniel; Suri, Subhash; Xue, Jie

doi:10.4230/LIPIcs.ESA.2021.63

Document Identifiers

DOI: 10.4230/LIPIcs.ESA.2021.63
URN: urn:nbn:de:0030-drops-146443

Author Details

Daniel Lokshtanov

Department of Computer Science, University of California, Santa Barbara, CA, USA

Subhash Suri

Department of Computer Science, University of California, Santa Barbara, CA, USA

Jie Xue

Department of Computer Science, University of California, Santa Barbara, CA, USA

Cite As

Daniel Lokshtanov, Subhash Suri, and Jie Xue. Efficient Algorithms for Least Square Piecewise Polynomial Regression. In 29th Annual European Symposium on Algorithms (ESA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 204, pp. 63:1-63:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.ESA.2021.63

Abstract

We present approximation and exact algorithms for piecewise regression of univariate and bivariate data using fixed-degree polynomials. Specifically, given a set S of n data points (𝐱₁, y₁), …, (𝐱_n, y_n) ∈ ℝ^d × ℝ where d ∈ {1, 2}, the goal is to segment the 𝐱_i’s into some (arbitrary) number of disjoint pieces P₁, …, P_k, where each piece P_j is associated with a fixed-degree polynomial f_j: ℝ^d → ℝ, so as to minimize the total loss function λk + ∑_{i=1}^n (y_i − f(𝐱_i))², where λ ≥ 0 is a regularization parameter that penalizes model complexity (the number of pieces) and f: ⨆_{j=1}^k P_j → ℝ is the piecewise polynomial function defined by f|_{P_j} = f_j. The pieces P₁, …, P_k are disjoint intervals of ℝ in the case of univariate data and disjoint axis-aligned rectangles in the case of bivariate data. Our error approximation allows the use of any fixed-degree polynomial, not just linear functions. Our main results are the following. For univariate data, we present a (1 + ε)-approximation algorithm with time complexity O((n/ε) log(1/ε)), assuming that the data is presented in sorted order of the x_i’s. For bivariate data, we present three results: a sub-exponential exact algorithm with running time n^{O(√n)}; a polynomial-time constant-approximation algorithm; and a quasi-polynomial time approximation scheme (QPTAS). The bivariate case is believed to be NP-hard in the folklore, but we could not find a published record in the literature, so in this paper we also present a hardness proof for completeness.
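To make the univariate objective λk + ∑ (y_i − f(x_i))² concrete, here is a minimal Python sketch of the textbook quadratic-time dynamic program for segmented least squares. It is not the (1 + ε)-approximation algorithm announced in the abstract, and it fixes the polynomial degree to 1 for brevity. The function name `segmented_least_squares` and the parameter name `lam` are illustrative choices, not taken from the paper.

```python
from itertools import accumulate


def segmented_least_squares(xs, ys, lam):
    """Exact O(n^2) dynamic program minimizing lam * k + total squared error.

    Assumptions (illustrative, not from the paper): xs is sorted in
    increasing order, and each interval piece is fitted with a
    least-squares line (degree 1).
    Returns (optimal loss, list of half-open index ranges (i, j)).
    """
    n = len(xs)

    def prefix(values):
        # Prefix sums allow the fit error of any segment in O(1) time.
        return [0.0] + list(accumulate(values))

    sx, sy = prefix(xs), prefix(ys)
    sxx = prefix(x * x for x in xs)
    syy = prefix(y * y for y in ys)
    sxy = prefix(x * y for x, y in zip(xs, ys))

    def sse(i, j):
        """Squared error of the best line through points i .. j-1."""
        m = j - i
        a, b = sx[j] - sx[i], sy[j] - sy[i]
        var_x = (sxx[j] - sxx[i]) - a * a / m
        var_y = (syy[j] - syy[i]) - b * b / m
        cov = (sxy[j] - sxy[i]) - a * b / m
        if var_x <= 1e-12:
            # All x equal (or a single point): the best fit is a constant.
            return max(var_y, 0.0)
        return max(var_y - cov * cov / var_x, 0.0)

    # best[j] = minimum loss for the first j points; back[j] = start index
    # of the last piece in an optimal solution for the first j points.
    best = [0.0] + [float("inf")] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cand = best[i] + lam + sse(i, j)
            if cand < best[j]:
                best[j], back[j] = cand, i

    # Walk the back-pointers to recover the pieces.
    pieces = []
    j = n
    while j > 0:
        pieces.append((back[j], j))
        j = back[j]
    pieces.reverse()
    return best[n], pieces


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4, 5, 6, 7]
    ys = [0, 1, 2, 3, 10, 9, 8, 7]  # two exact lines, breaking after x = 3
    print(segmented_least_squares(xs, ys, lam=1.0))
```

Raising `lam` makes each extra piece more expensive, so the optimum shifts toward fewer pieces. With a sufficiently large `lam`, the sample data above is covered by a single line.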

Subject Classification

ACM Subject Classification

Theory of computation → Computational geometry

Keywords

regression analysis
piecewise polynomial
least square error
