Efficient Algorithms for Least Square Piecewise Polynomial Regression
We present approximation and exact algorithms for piecewise regression of univariate and bivariate data using fixed-degree polynomials. Specifically, given a set S of n data points (𝐱₁, y₁), …, (𝐱_n, y_n) ∈ ℝ^d × ℝ where d ∈ {1,2}, the goal is to segment the 𝐱_i's into some (arbitrary) number of disjoint pieces P₁, …, P_k, where each piece P_j is associated with a fixed-degree polynomial f_j: ℝ^d → ℝ, to minimize the total loss function λk + ∑_{i=1}^n (y_i - f(𝐱_i))², where λ ≥ 0 is a regularization term that penalizes model complexity (number of pieces) and f: ⨆_{j=1}^k P_j → ℝ is the piecewise polynomial function defined as f|_{P_j} = f_j. The pieces P₁, …, P_k are disjoint intervals of ℝ in the case of univariate data and disjoint axis-aligned rectangles in the case of bivariate data. Our error approximation allows use of any fixed-degree polynomial, not just linear functions.
Our main results are the following. For univariate data, we present a (1 + ε)-approximation algorithm with time complexity O((n/ε) log(1/ε)), assuming that the data is presented in sorted order of the x_i's. For bivariate data, we present three results: a sub-exponential exact algorithm with running time n^{O(√n)}; a polynomial-time constant-approximation algorithm; and a quasi-polynomial time approximation scheme (QPTAS). The bivariate case is believed to be NP-hard in the folklore but we could not find a published record in the literature, so in this paper we also present a hardness proof for completeness.
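As a concrete illustration of the loss λk + ∑ (y_i - f(𝐱_i))² in the univariate case, the following minimal sketch minimizes it exactly with the textbook quadratic-time dynamic program, restricted to degree-0 (piecewise-constant) pieces. It is not the (1 + ε)-approximation algorithm of the paper, and the function name `segment_constant`, its signature, and the degree-0 restriction are assumptions made only for this example.

```python
def segment_constant(ys, lam):
    """Minimize lam * k + sum of squared errors over piecewise-constant fits.

    Illustrative O(n^2) dynamic program (hypothetical helper, not the
    paper's algorithm). Assumes ys is listed in sorted order of x, so each
    piece is a contiguous run of indices.
    Returns (total_loss, pieces), pieces being (start, end) with end exclusive.
    """
    n = len(ys)
    # Prefix sums of y and y^2 give the squared error of any run in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, y in enumerate(ys):
        s1[i + 1] = s1[i] + y
        s2[i + 1] = s2[i] + y * y

    def sse(i, j):
        # Squared error of the best constant (the mean) on ys[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # best[j] = minimum loss for the first j points; prev[j] = start of last piece.
    best = [0.0] + [float("inf")] * n
    prev = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + lam + sse(i, j)
            if cost < best[j]:
                best[j] = cost
                prev[j] = i

    # Walk the prev pointers back from n to recover the pieces.
    pieces = []
    j = n
    while j > 0:
        pieces.append((prev[j], j))
        j = prev[j]
    pieces.reverse()
    return best[n], pieces
```

For example, `segment_constant([1, 1, 1, 5, 5, 5], 1.0)` splits the data into two pieces, while a large penalty such as `lam = 100.0` makes a single piece cheaper, which shows how λ trades the number of pieces against the fitting error.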
regression analysis
piecewise polynomial
least square error
Theory of computation → Computational geometry
63:1-63:15
Regular Paper
Daniel
Lokshtanov
Daniel Lokshtanov
Department of Computer Science, University of California, Santa Barbara, CA, USA
Subhash
Suri
Subhash Suri
Department of Computer Science, University of California, Santa Barbara, CA, USA
Jie
Xue
Jie Xue
Department of Computer Science, University of California, Santa Barbara, CA, USA
10.4230/LIPIcs.ESA.2021.63
Jayadev Acharya, Ilias Diakonikolas, Chinmay Hegde, Jerry Zheng Li, and Ludwig Schmidt. Fast and near-optimal algorithms for approximating distributions by histograms. In Proceedings of the 34th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 249-263, 2015.
Jayadev Acharya, Ilias Diakonikolas, Jerry Li, and Ludwig Schmidt. Fast algorithms for segmented regression. In International Conference on Machine Learning, pages 2878-2886, 2016.
Anna Adamaszek, Sariel Har-Peled, and Andreas Wiese. Approximation schemes for independent set and sparse subsets of polygons. Journal of the ACM (JACM), 66(4):1-40, 2019.
Anna Adamaszek and Andreas Wiese. Approximation schemes for maximum weight independent set of rectangles. In 2013 IEEE 54th annual symposium on foundations of computer science, pages 400-409. IEEE, 2013.
Anna Adamaszek and Andreas Wiese. A QPTAS for maximum weight independent set of polygons with polylogarithmically many vertices. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 645-656. SIAM, 2014.
Pankaj K Agarwal, Sariel Har-Peled, Nabil H Mustafa, and Yusu Wang. Near-linear time approximation algorithms for curve simplification. Algorithmica, 42(3-4):203-219, 2005.
Pankaj K Agarwal and Subhash Suri. Surface approximation and geometric partitions. SIAM Journal on Computing, 27(4):1016-1035, 1998.
Boris Aronov, Tetsuo Asano, Naoki Katoh, Kurt Mehlhorn, and Takeshi Tokuyama. Polyline fitting of planar points under min-sum criteria. International journal of computational geometry & applications, 16(02n03):97-116, 2006.
Ilias Diakonikolas, Jerry Li, and Anastasia Voloshinov. Efficient algorithms for multidimensional segmented regression. arXiv preprint arXiv:2003.11086, 2020.
Michael T Goodrich. Efficient piecewise-linear function approximation using the uniform metric: (preliminary version). In Proceedings of the tenth annual Symposium on Computational geometry, pages 322-331, 1994.
Sudipto Guha. On the space-time of optimal, approximate and streaming algorithms for synopsis construction problems. The VLDB Journal, 17(6):1509-1535, 2008.
S Louis Hakimi and Edward F Schmeichel. Fitting polygonal functions to a set of points in the plane. CVGIP: Graphical Models and Image Processing, 53(2):132-136, 1991.
Sariel Har-Peled. Quasi-polynomial time approximation scheme for sparse subsets of polygons. In Proceedings of the thirtieth annual symposium on Computational geometry, pages 120-129, 2014.
Hosagrahar Visvesvaraya Jagadish, Nick Koudas, S Muthukrishnan, Viswanath Poosala, Kenneth C Sevcik, and Torsten Suel. Optimal histograms with quality guarantees. In VLDB, volume 98, pages 24-27, 1998.
Jon Kleinberg and Eva Tardos. Algorithm design. Pearson Education India, 2006.
Frederick Mosteller, John Wilder Tukey, et al. Data analysis and regression: a second course in statistics. Pearson, 1977.
DP Wang, NF Huang, HS Chao, and Richard CT Lee. Plane sweep algorithms for the polygonal approximation problems with applications. In International Symposium on Algorithms and Computation, pages 515-522. Springer, 1993.
Daniel Lokshtanov, Subhash Suri, and Jie Xue
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode