Monotone Probability Distributions over the Boolean Cube Can Be Learned with Sublinear Samples

<records>

<record>

<language>eng</language>

<publisher>Schloss Dagstuhl – Leibniz-Zentrum für Informatik</publisher>

<journalTitle>Leibniz International Proceedings in Informatics</journalTitle>

<issn>1868-8969</issn>

<publicationDate>2020-01-06</publicationDate>

<startPage>28:1</startPage>

<endPage>28:34</endPage>

<doi>10.4230/LIPIcs.ITCS.2020.28</doi>

<documentType>article</documentType>

<title language="eng">Monotone Probability Distributions over the Boolean Cube Can Be Learned with Sublinear Samples</title>

<authors>

<author>

<name>Rubinfeld, Ronitt</name>

<affiliationId>1</affiliationId>

<affiliationId>2</affiliationId>

</author>

<author>

<name>Vasilyan, Arsen</name>

<affiliationId>1</affiliationId>

</author>

</authors>

<affiliationsList>

<affiliationName affiliationId="1">CSAIL at MIT, Cambridge, MA, USA</affiliationName>

<affiliationName affiliationId="2">Blavatnik School of Computer Science at Tel Aviv University, Israel</affiliationName>

</affiliationsList>

<abstract language="eng">A probability distribution over the Boolean cube is monotone if flipping the value of a coordinate from zero to one can only increase the probability of an element. Given samples of an unknown monotone distribution over the Boolean cube, we give (to our knowledge) the first algorithm that learns an approximation of the distribution in statistical distance using a number of samples that is sublinear in the domain. To do this, we develop a structural lemma describing monotone probability distributions. The structural lemma has further implications to the sample complexity of basic testing tasks for analyzing monotone probability distributions over the Boolean cube: We use it to give nontrivial upper bounds on the tasks of estimating the distance of a monotone distribution to uniform and of estimating the support size of a monotone distribution. In the setting of monotone probability distributions over the Boolean cube, our algorithms are the first to have sample complexity lower than known lower bounds for the same testing tasks on arbitrary (not necessarily monotone) probability distributions. One further consequence of our learning algorithm is an improved sample complexity for the task of testing whether a distribution on the Boolean cube is monotone.</abstract>

<fullTextUrl format="pdf">https://drops.dagstuhl.de/storage/00lipics/lipics-vol151-itcs2020/LIPIcs.ITCS.2020.28/LIPIcs.ITCS.2020.28.pdf</fullTextUrl>

<keywords>

<keyword>Learning distributions</keyword>

<keyword>monotone probability distributions</keyword>

<keyword>estimating support size</keyword>

</keywords>

</record>

</records>