Document

Reinforcement Learning (Dagstuhl Seminar 13321)

Authors: Peter Auer, Marcus Hutter, and Laurent Orseau

Published in: Dagstuhl Reports, Volume 3, Issue 8 (2013)

Abstract

This Dagstuhl Seminar also served as the 11th European Workshop on Reinforcement Learning (EWRL11). Reinforcement learning attracts more attention each year, as can be seen at the various conferences (ECML, ICML, IJCAI, ...). EWRL, and this Dagstuhl Seminar in particular, aimed to gather people interested in reinforcement learning from all around the globe. This unusual format for EWRL helped participants view the field and discuss topics differently.

Cite as

Peter Auer, Marcus Hutter, and Laurent Orseau. Reinforcement Learning (Dagstuhl Seminar 13321). In Dagstuhl Reports, Volume 3, Issue 8, pp. 1-26, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2013)

@Article{auer_et_al:DagRep.3.8.1,
  author =	{Auer, Peter and Hutter, Marcus and Orseau, Laurent},
  title =	{{Reinforcement Learning (Dagstuhl Seminar 13321)}},
  pages =	{1--26},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2013},
  volume =	{3},
  number =	{8},
  editor =	{Auer, Peter and Hutter, Marcus and Orseau, Laurent},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.3.8.1},
  URN =		{urn:nbn:de:0030-drops-43409},
  doi =		{10.4230/DagRep.3.8.1},
  annote =	{Keywords: Machine Learning, Reinforcement Learning, Markov Decision Processes, Planning}
}

Document

DOI: 10.4230/DagSemProc.06201.6

Sequence prediction for non-stationary processes

Authors: Daniil Ryabko and Marcus Hutter

Published in: Dagstuhl Seminar Proceedings, Volume 6201, Combinatorial and Algorithmic Foundations of Pattern and Association Discovery (2006)

Abstract

We address the problem of sequence prediction for nonstationary stochastic processes. In particular, given two measures on the set of one-way infinite sequences over a finite alphabet, consider the question whether one of the measures predicts the other. We find some conditions on local absolute continuity under which prediction is possible.

Cite as

Daniil Ryabko and Marcus Hutter. Sequence prediction for non-stationary processes. In Combinatorial and Algorithmic Foundations of Pattern and Association Discovery. Dagstuhl Seminar Proceedings, Volume 6201, pp. 1-12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

@InProceedings{ryabko_et_al:DagSemProc.06201.6,
  author =	{Ryabko, Daniil and Hutter, Marcus},
  title =	{{Sequence prediction for non-stationary processes}},
  booktitle =	{Combinatorial and Algorithmic Foundations of Pattern and Association Discovery},
  pages =	{1--12},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6201},
  editor =	{Rudolf Ahlswede and Alberto Apostolico and Vladimir I. Levenshtein},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06201.6},
  URN =		{urn:nbn:de:0030-drops-7900},
  doi =		{10.4230/DagSemProc.06201.6},
  annote =	{Keywords: Sequence prediction, probability forecasting, local absolute continuity}
}

Document

DOI: 10.4230/DagSemProc.06051.1

06051 Abstracts Collection – Kolmogorov Complexity and Applications

Authors: Marcus Hutter, Wolfgang Merkle, and Paul M.B. Vitanyi

Published in: Dagstuhl Seminar Proceedings, Volume 6051, Kolmogorov Complexity and Applications (2006)

Abstract

From 29.01.06 to 03.02.06, the Dagstuhl Seminar 06051 "Kolmogorov Complexity and Applications" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Cite as

Marcus Hutter, Wolfgang Merkle, and Paul M.B. Vitanyi. 06051 Abstracts Collection – Kolmogorov Complexity and Applications. In Kolmogorov Complexity and Applications. Dagstuhl Seminar Proceedings, Volume 6051, pp. 1-17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

@InProceedings{hutter_et_al:DagSemProc.06051.1,
  author =	{Hutter, Marcus and Merkle, Wolfgang and Vitanyi, Paul M.B.},
  title =	{{06051 Abstracts Collection -- Kolmogorov Complexity and Applications}},
  booktitle =	{Kolmogorov Complexity and Applications},
  pages =	{1--17},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6051},
  editor =	{Marcus Hutter and Wolfgang Merkle and Paul M.B. Vitanyi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06051.1},
  URN =		{urn:nbn:de:0030-drops-6632},
  doi =		{10.4230/DagSemProc.06051.1},
  annote =	{Keywords: Information theory, Kolmogorov Complexity, effective randomness, algorithmic probability, recursion theory, computational complexity, machine learning, knowledge discovery}
}

Document

DOI: 10.4230/DagSemProc.06051.6

Complexity Monotone in Conditions and Future Prediction Errors

Authors: Alexey Chernov, Marcus Hutter, and Jürgen Schmidhuber

Published in: Dagstuhl Seminar Proceedings, Volume 6051, Kolmogorov Complexity and Applications (2006)

Abstract

We bound the future loss when predicting any (computably) stochastic sequence online. Solomonoff finitely bounded the total deviation of his universal predictor $M$ from the true distribution $\mu$ by the algorithmic complexity of $\mu$. Here we assume we are at a time $t>1$ and already observed $x=x_1...x_t$. We bound the future prediction performance on $x_{t+1}x_{t+2}...$ by a new variant of algorithmic complexity of $\mu$ given $x$, plus the complexity of the randomness deficiency of $x$. The new complexity is monotone in its condition in the sense that this complexity can only decrease if the condition is prolonged. We also briefly discuss potential generalizations to Bayesian model classes and to classification problems.

Cite as

Alexey Chernov, Marcus Hutter, and Jürgen Schmidhuber. Complexity Monotone in Conditions and Future Prediction Errors. In Kolmogorov Complexity and Applications. Dagstuhl Seminar Proceedings, Volume 6051, pp. 1-20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

@InProceedings{chernov_et_al:DagSemProc.06051.6,
  author =	{Chernov, Alexey and Hutter, Marcus and Schmidhuber, J{\"u}rgen},
  title =	{{Complexity Monotone in Conditions and Future Prediction Errors}},
  booktitle =	{Kolmogorov Complexity and Applications},
  pages =	{1--20},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6051},
  editor =	{Marcus Hutter and Wolfgang Merkle and Paul M.B. Vitanyi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06051.6},
  URN =		{urn:nbn:de:0030-drops-6327},
  doi =		{10.4230/DagSemProc.06051.6},
  annote =	{Keywords: Kolmogorov complexity, posterior bounds, online sequential prediction, Solomonoff prior, monotone conditional complexity, total error, future loss, ra}
}

Document

DOI: 10.4230/DagSemProc.06051.8

Learning in Reactive Environments with Arbitrary Dependence

Authors: Daniil Ryabko and Marcus Hutter

Published in: Dagstuhl Seminar Proceedings, Volume 6051, Kolmogorov Complexity and Applications (2006)

Abstract

In reinforcement learning, the task for an agent is to attain the best possible asymptotic reward where the true generating environment is unknown but belongs to a known countable family of environments. This task generalises the sequence prediction problem, in which the environment does not react to the behaviour of the agent. Solomonoff induction solves the sequence prediction problem for any countable class of measures; however, it is easy to see that such a result is impossible for reinforcement learning: not every countable class of environments can be learnt. We find some sufficient conditions on the class of environments under which there exists an agent that attains the best asymptotic reward for any environment in the class. We analyze how tight these conditions are and how they relate to different probabilistic assumptions known in reinforcement learning and related fields, such as Markov Decision Processes and mixing conditions.

Cite as

Daniil Ryabko and Marcus Hutter. Learning in Reactive Environments with Arbitrary Dependence. In Kolmogorov Complexity and Applications. Dagstuhl Seminar Proceedings, Volume 6051, pp. 1-15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006)

@InProceedings{ryabko_et_al:DagSemProc.06051.8,
  author =	{Ryabko, Daniil and Hutter, Marcus},
  title =	{{Learning in Reactive Environments with Arbitrary Dependence}},
  booktitle =	{Kolmogorov Complexity and Applications},
  pages =	{1--15},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2006},
  volume =	{6051},
  editor =	{Marcus Hutter and Wolfgang Merkle and Paul M.B. Vitanyi},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.06051.8},
  URN =		{urn:nbn:de:0030-drops-6372},
  doi =		{10.4230/DagSemProc.06051.8},
  annote =	{Keywords: Reinforcement learning, asymptotic average value, self-optimizing policies, (non) Markov decision processes}
}

Search Results

Documents authored by Hutter, Marcus
