De, Anindya ;
Long, Philip M. ;
Servedio, Rocco A.
Density Estimation for Shift-Invariant Multidimensional Distributions
Abstract
We study density estimation for classes of shift-invariant distributions over R^d. A multidimensional distribution is "shift-invariant" if, roughly speaking, it is close in total variation distance to a small shift of it in any direction. Shift-invariance relaxes smoothness assumptions commonly used in non-parametric density estimation to allow jump discontinuities. The different classes of distributions that we consider correspond to different rates of tail decay.
For each such class we give an efficient algorithm that learns any distribution in the class from independent samples with respect to total variation distance. As a special case of our general result, we show that d-dimensional shift-invariant distributions which satisfy an exponential tail bound can be learned to total variation distance error epsilon using O~_d(1/epsilon^{d+2}) examples and O~_d(1/epsilon^{2d+2}) time. This implies that, for constant d, multivariate log-concave distributions can be learned in O~_d(1/epsilon^{2d+2}) time using O~_d(1/epsilon^{d+2}) samples, answering a question of [Diakonikolas et al., 2016]. All of our results extend to a model of noise-tolerant density estimation using Huber's contamination model, in which the target distribution to be learned is a (1-epsilon, epsilon) mixture of some unknown distribution in the class with some other arbitrary and unknown distribution, and the learning algorithm must output a hypothesis distribution with total variation distance error O(epsilon) from the target distribution. We show that our general results are close to best possible by proving a simple Omega(1/epsilon^d) information-theoretic lower bound on sample complexity even for learning bounded distributions that are shift-invariant.
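To make the noise-tolerant setting concrete, the following minimal Python sketch draws samples from a (1-epsilon, epsilon) mixture of the kind described above. The function name `sample_contaminated`, the clean distribution (uniform on the unit square, which has jump discontinuities at its boundary), and the noise distribution (a far-away point mass) are illustrative assumptions and are not taken from the paper. The sketch only generates contaminated data; it does not implement the learning algorithm.

```python
import random


def sample_contaminated(n, epsilon, sample_clean, sample_noise, seed=0):
    """Draw n points from the mixture (1 - epsilon) * clean + epsilon * noise.

    sample_clean and sample_noise are hypothetical callables that take a
    random.Random instance and return one point (a tuple of floats).
    """
    rng = random.Random(seed)
    points, from_noise = [], []
    for _ in range(n):
        # With probability epsilon the point comes from the arbitrary
        # contaminating distribution, otherwise from the clean one.
        is_noise = rng.random() < epsilon
        points.append(sample_noise(rng) if is_noise else sample_clean(rng))
        from_noise.append(is_noise)
    return points, from_noise


def clean(rng):
    # Assumed clean distribution for illustration: uniform on [0, 1]^2.
    return (rng.random(), rng.random())


def noise(rng):
    # Assumed contamination for illustration: a point mass far from the data.
    return (100.0, 100.0)


points, from_noise = sample_contaminated(10_000, 0.05, clean, noise)
noise_fraction = sum(from_noise) / len(from_noise)
print(f"fraction of contaminated samples: {noise_fraction:.3f}")
```

A learner in this model sees only `points`, not the `from_noise` labels. For constant d = 2, the bounds stated above specialize to roughly 1/epsilon^4 samples and 1/epsilon^6 time, up to the factors hidden by the O~_d notation.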
BibTeX  Entry
@InProceedings{de_et_al:LIPIcs:2018:10121,
author = {Anindya De and Philip M. Long and Rocco A. Servedio},
title = {{Density Estimation for Shift-Invariant Multidimensional Distributions}},
booktitle = {10th Innovations in Theoretical Computer Science Conference (ITCS 2019)},
pages = {28:1--28:20},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-095-8},
ISSN = {1868-8969},
year = {2018},
volume = {124},
editor = {Avrim Blum},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2018/10121},
URN = {urn:nbn:de:0030-drops-101214},
doi = {10.4230/LIPIcs.ITCS.2019.28},
annote = {Keywords: Density estimation, unsupervised learning, log-concave distributions, non-parametrics}
}
Keywords: Density estimation, unsupervised learning, log-concave distributions, non-parametrics
Seminar: 10th Innovations in Theoretical Computer Science Conference (ITCS 2019)
Issue date: 2018
Date of publication: 08.01.2019