Learning in Reactive Environments with Arbitrary Dependence

Ryabko, Daniil; Hutter, Marcus

doi:10.4230/DagSemProc.06051.8

Part of: Volume: Dagstuhl Seminar Proceedings, Volume 6051
Part of: Series: Dagstuhl Seminar Proceedings (DagSemProc)
License: Creative Commons Attribution 4.0 International license
Publication Date: 2006-07-31

PDF: DagSemProc.06051.8.pdf (216 kB, 15 pages)

Document Identifiers

DOI: 10.4230/DagSemProc.06051.8
URN: urn:nbn:de:0030-drops-6372

Subject Classification

Keywords

Reinforcement learning
asymptotic average value
self-optimizing policies
(non) Markov decision processes

Abstract

In reinforcement learning, the task for an agent is to attain the best
possible asymptotic reward when the true generating environment is
unknown but belongs to a known countable family of environments.
This task generalises the sequence prediction problem, in which
the environment does not react to the behaviour of the agent.
Solomonoff induction solves the sequence prediction problem
for any countable class of measures; however, it is easy to see
that such a result is impossible for reinforcement learning: not every
countable class of environments can be learnt.
We find some sufficient conditions on the class of environments under
which there exists an agent that attains the best asymptotic reward
for any environment in the class. We analyze how tight these conditions
are and how they relate to different probabilistic assumptions known in
reinforcement learning and related fields, such as Markov
decision processes and mixing conditions.
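
The claim that some classes of environments cannot be learnt can be illustrated with a minimal sketch. The two "trap" environments below are a hypothetical construction for illustration only and are not taken from the paper: the first action irrevocably fixes all future rewards, and the two environments disagree about which first action is the good one. The names `make_env`, `average_reward`, `ENVIRONMENTS`, and `HORIZON` are likewise assumptions. Since the agent must choose its first action before observing anything, the sketch simply enumerates all deterministic action sequences.

```python
from itertools import product


def make_env(good_first_action):
    """Hypothetical 'trap' environment: the first action decides every reward."""
    def run(actions):
        # Reward is 1 at every step if the first action was the good one, else 0.
        locked_in = actions[0] == good_first_action
        return [1.0 if locked_in else 0.0 for _ in actions]
    return run


def average_reward(env, actions):
    """Average reward of a fixed action sequence in the given environment."""
    rewards = env(actions)
    return sum(rewards) / len(rewards)


# A class of two environments that disagree on the good first action.
ENVIRONMENTS = {"env_a": make_env(0), "env_b": make_env(1)}
HORIZON = 5

# For every action sequence, record its worst average reward over the class.
worst_case = {
    actions: min(average_reward(env, actions) for env in ENVIRONMENTS.values())
    for actions in product([0, 1], repeat=HORIZON)
}

# The best average reward any single sequence guarantees in both environments.
best_guarantee = max(worst_case.values())
print(best_guarantee)
```

In each environment taken alone, an average reward of 1 is attainable, yet every action sequence obtains 0 in at least one of the two, so no single deterministic agent is optimal for the whole class. Increasing `HORIZON` does not change this, because the first action cannot be undone.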

Cite As

Daniil Ryabko and Marcus Hutter. Learning in Reactive Environments with Arbitrary Dependence. In Kolmogorov Complexity and Applications. Dagstuhl Seminar Proceedings, Volume 6051, pp. 1-15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006) https://doi.org/10.4230/DagSemProc.06051.8
