Documents authored by Szörényi, Balázs


Document
Unlabeled Data Does Provably Help

Authors: Malte Darnstädt, Hans Ulrich Simon, and Balázs Szörényi

Published in: LIPIcs, Volume 20, 30th International Symposium on Theoretical Aspects of Computer Science (STACS 2013)


Abstract
A fully supervised learner needs access to correctly labeled examples, whereas a semi-supervised learner has access to examples, part of which are labeled and part of which are not. The hope is that a large collection of unlabeled examples significantly reduces the need for labeled ones. It is widely believed that this reduction of "label complexity" is marginal unless the hidden target concept and the domain distribution satisfy some "compatibility assumptions". There are some recent papers in support of this belief. In this paper, we revitalize the discussion by presenting a result that goes in the other direction. To this end, we consider the PAC-learning model in two settings: the (classical) fully supervised setting and the semi-supervised setting. We show that the "label-complexity gap" between the semi-supervised and the fully supervised setting can become arbitrarily large for concept classes of infinite VC-dimension (or sequences of classes whose VC-dimensions are finite but become arbitrarily large). On the other hand, this gap is bounded by O(ln |C|) for each finite concept class C that contains the constant zero- and the constant one-function. A similar statement holds for all classes C of finite VC-dimension.

Cite as

Malte Darnstädt, Hans Ulrich Simon, and Balázs Szörényi. Unlabeled Data Does Provably Help. In 30th International Symposium on Theoretical Aspects of Computer Science (STACS 2013). Leibniz International Proceedings in Informatics (LIPIcs), Volume 20, pp. 185-196, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2013)


@InProceedings{darnstadt_et_al:LIPIcs.STACS.2013.185,
  author =	{Darnst\"{a}dt, Malte and Simon, Hans Ulrich and Sz\"{o}r\'{e}nyi, Bal\'{a}zs},
  title =	{{Unlabeled Data Does Provably Help}},
  booktitle =	{30th International Symposium on Theoretical Aspects of Computer Science (STACS 2013)},
  pages =	{185--196},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-50-7},
  ISSN =	{1868-8969},
  year =	{2013},
  volume =	{20},
  editor =	{Portier, Natacha and Wilke, Thomas},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2013.185},
  URN =		{urn:nbn:de:0030-drops-39337},
  doi =		{10.4230/LIPIcs.STACS.2013.185},
  annote =	{Keywords: algorithmic learning, sample complexity, semi-supervised learning}
}