L1 Regression with Lewis Weights Subsampling

Parulekar, Aditya; Parulekar, Advait; Price, Eric

doi:10.4230/LIPIcs.APPROX/RANDOM.2021.49

File

LIPIcs.APPROX-RANDOM.2021.49.pdf

Filesize: 0.76 MB
21 pages

Document Identifiers

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2021.49
URN: urn:nbn:de:0030-drops-147422

Author Details

Aditya Parulekar

Department of Computer Science, University of Texas at Austin, TX, USA

Advait Parulekar

Department of Electrical and Computer Engineering, University of Texas at Austin, TX, USA

Eric Price

Department of Computer Science, University of Texas at Austin, TX, USA

Acknowledgements

The authors wish to thank the anonymous reviewers for several comments that improved the paper.

Cite AsGet BibTex

Aditya Parulekar, Advait Parulekar, and Eric Price. L1 Regression with Lewis Weights Subsampling. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 207, pp. 49:1-49:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2021.49

Abstract

We consider the problem of finding an approximate solution to 𝓁₁ regression while only observing a small number of labels. Given an n × d unlabeled data matrix X, we must choose a small set of m ≪ n rows to observe the labels of, then output an estimate β̂ whose error on the original problem is within a 1 + ε factor of optimal. We show that sampling from X according to its Lewis weights and outputting the empirical minimizer succeeds with probability 1-δ for m > O(1/(ε²) d log d/(ε δ)). This is analogous to the performance of sampling according to leverage scores for 𝓁₂ regression, but with exponentially better dependence on δ. We also give a corresponding lower bound of Ω(d/(ε²) + (d + 1/(ε²)) log 1/(δ)).

Subject Classification

ACM Subject Classification

Computing methodologies → Active learning settings

Keywords

Active regression
Lewis weights

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

PDF Downloads

0

Metadata Views

References

Xue Chen and Michał Dereziński. Query complexity of least absolute deviation regression via robust uniform convergence. arXiv preprint arXiv:2102.02322, 2021.
Xue Chen and Eric Price. Active regression via linear-sample sparsification. In Alina Beygelzimer and Daniel Hsu, editors, Proceedings of the Thirty-Second Conference on Learning Theory, volume 99 of Proceedings of Machine Learning Research, pages 663-695, Phoenix, USA, June 25-28 2019. PMLR. URL: http://proceedings.mlr.press/v99/chen19a.html.
Kenneth L. Clarkson. Subgradient and sampling algorithms for l1 regression. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '05, page 257–266, USA, 2005. Society for Industrial and Applied Mathematics.
Michael B. Cohen and Richard Peng. Lp row sampling by lewis weights. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC '15, page 183–192, New York, NY, USA, 2015. Association for Computing Machinery. URL: https://doi.org/10.1145/2746539.2746567.
Anirban Dasgupta, Petros Drineas, Boulos Harb, Ravi Kumar, and Michael W Mahoney. Sampling algorithms and coresets for $$1ell_p regression. SIAM Journal on Computing, 38(5):2060-2078, 2009.
Michał Derezinski and Michael W Mahoney. Determinantal point processes in randomized numerical linear algebra. Notices of the American Mathematical Society, 68(1), 2021.
Michał Dereziński and Manfred K Warmuth. Unbiased estimates for linear regression via volume sampling. arXiv preprint arXiv:1705.06908, 2017.
Petros Drineas, Michael W Mahoney, and Shan Muthukrishnan. Sampling algorithms for l 2 regression and applications. In Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, pages 1127-1136, 2006.
David Durfee, Kevin A. Lai, and Saurabh Sawlani. 𝓁₁ regression using lewis weights preconditioning and stochastic gradient descent. In Sébastien Bubeck, Vianney Perchet, and Philippe Rigollet, editors, Proceedings of the 31st Conference On Learning Theory, volume 75 of Proceedings of Machine Learning Research, pages 1626-1656. PMLR, July 06-09 2018. URL: http://proceedings.mlr.press/v75/durfee18a.html.
E. N. Gilbert. A comparison of signalling alphabets. The Bell System Technical Journal, 31(3):504-522, 1952. URL: https://doi.org/10.1002/j.1538-7305.1952.tb01393.x.
T. Lattimore and C. Szepesvari. Bandit Algorithms. Cambridge University Press, 2020.
M. Ledoux and M. Talagrand. Comparison theorems, random geometry and some limit theorems for empirical processes. Ann. Probab., 17(2):596-631, April 1989. URL: https://doi.org/10.1214/aop/1176991418.
Michael W. Mahoney. Randomized algorithms for matrices and data. Found. Trends Mach. Learn., 3(2):123–224, 2011. URL: https://doi.org/10.1561/2200000035.
Aditya Parulekar, Advait Parulekar, and Eric Price. L1 regression with lewis weights subsampling. CoRR, abs/2105.09433, 2021. URL: http://arxiv.org/abs/2105.09433.
Michel Talagrand. Embedding subspaces of l1 into ln 1. Proceedings of the American Mathematical Society, 108(2):363-369, 1990. URL: http://www.jstor.org/stable/2048283.
Michel Talagrand. Embedding subspaces of lp in lpn. In J. Lindenstrauss and V. Milman, editors, Geometric Aspects of Functional Analysis, pages 311-326, Basel, 1995. Birkhäuser Basel.
David P. Woodruff. Sketching as a tool for numerical linear algebra. Found. Trends Theor. Comput. Sci., 10(1–2):1–157, 2014. URL: https://doi.org/10.1561/0400000060.