We study the problem of robust multivariate polynomial regression: let p: ℝⁿ → ℝ be an unknown n-variate polynomial of degree at most d in each variable. We are given as input a set of random samples (𝐱_i,y_i) ∈ [-1,1]ⁿ × ℝ that are noisy versions of (𝐱_i,p(𝐱_i)). More precisely, each 𝐱_i is sampled independently from some distribution χ on [-1,1]ⁿ, and for each i independently, y_i is arbitrary (i.e., an outlier) with probability at most ρ < 1/2, and otherwise satisfies |y_i-p(𝐱_i)| ≤ σ. The goal is to output a polynomial p̂, of degree at most d in each variable, within an ℓ_∞-distance of at most O(σ) from p. Kane, Karmalkar, and Price [FOCS'17] solved this problem for n = 1. We generalize their results to the n-variate setting, giving an algorithm that achieves a sample complexity of O_n(dⁿ log d), where the hidden constant depends on n, if χ is the n-dimensional Chebyshev distribution. If the samples are instead drawn from the uniform distribution, the sample complexity is O_n(d^{2n} log d). The approximation error is guaranteed to be at most O(σ), and the run-time depends on log(1/σ). In the setting where each 𝐱_i and y_i are known up to N bits of precision, the run-time's dependence on N is linear. We also show that our sample complexities are optimal in terms of dⁿ. Furthermore, we show that it is possible to make the run-time independent of 1/σ, at the cost of a higher sample complexity.
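The sampling model described in the abstract can be illustrated with a minimal Python sketch for the univariate case (n = 1). It only generates data and is not the paper's algorithm. The function name `sample_noisy_polynomial`, the power-basis coefficient convention, and the `outlier_scale` parameter are assumptions made for illustration. Outliers are drawn at random here as a stand-in for arbitrary values, and each sample is corrupted with probability exactly ρ, whereas the model only requires "at most ρ".

```python
import numpy as np


def sample_noisy_polynomial(coeffs, m, rho, sigma, rng, outlier_scale=100.0):
    """Draw m samples (x_i, y_i) from the noise model, for n = 1.

    coeffs: power-basis coefficients of p, lowest degree first
            (a hypothetical convention chosen for this sketch).
    Returns x, y, and a boolean mask marking which samples are outliers.
    """
    # Chebyshev distribution on [-1, 1]: x = cos(pi * U) with U ~ Uniform[0, 1].
    x = np.cos(np.pi * rng.uniform(0.0, 1.0, size=m))
    clean = np.polynomial.polynomial.polyval(x, coeffs)

    # Inliers satisfy |y_i - p(x_i)| <= sigma; bounded uniform noise is one choice.
    y = clean + rng.uniform(-sigma, sigma, size=m)

    # Each sample independently becomes an outlier with probability rho.
    # The model allows arbitrary outlier values; random ones are an assumption.
    is_outlier = rng.random(m) < rho
    y[is_outlier] = rng.uniform(-outlier_scale, outlier_scale,
                                size=int(is_outlier.sum()))
    return x, y, is_outlier


rng = np.random.default_rng(0)
coeffs = [0.5, -1.0, 2.0]  # p(x) = 0.5 - x + 2x^2, an arbitrary example
x, y, is_outlier = sample_noisy_polynomial(coeffs, m=1000, rho=0.2,
                                           sigma=0.1, rng=rng)
```

Replacing the Chebyshev draw with `rng.uniform(-1.0, 1.0, size=m)` gives the uniform-distribution setting, for which the abstract states the larger sample complexity.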
@InProceedings{arora_et_al:LIPIcs.ESA.2024.12,
  author =	{Arora, Vipul and Bhattacharyya, Arnab and Boban, Mathews and Guruswami, Venkatesan and Kelman, Esty},
  title =	{{Outlier Robust Multivariate Polynomial Regression}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{12:1--12:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.12},
  URN =		{urn:nbn:de:0030-drops-210830},
  doi =		{10.4230/LIPIcs.ESA.2024.12},
  annote =	{Keywords: Robust Statistics, Polynomial Regression, Sample Efficient Learning}
}