DROPS

Document

Track B: Automata, Logic, Semantics, and Theory of Programming

DOI: 10.4230/LIPIcs.ICALP.2024.133

Finite-Memory Strategies for Almost-Sure Energy-MeanPayoff Objectives in MDPs

Authors: Mohan Dantam and Richard Mayr

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

Abstract

We consider finite-state Markov decision processes with the combined Energy-MeanPayoff objective. The controller tries to avoid running out of energy while simultaneously attaining a strictly positive mean payoff in a second dimension. We show that finite memory suffices for almost surely winning strategies for the Energy-MeanPayoff objective. This is in contrast to the closely related Energy-Parity objective, where almost surely winning strategies require infinite memory in general. We show that exponential memory is sufficient (even for deterministic strategies) and necessary (even for randomized strategies) for almost surely winning Energy-MeanPayoff. The upper bound holds even if the strictly positive mean payoff part of the objective is generalized to multidimensional strictly positive mean payoff. Finally, it is decidable in pseudo-polynomial time whether an almost surely winning strategy exists.

Cite as

Mohan Dantam and Richard Mayr. Finite-Memory Strategies for Almost-Sure Energy-MeanPayoff Objectives in MDPs. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 133:1-133:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{dantam_et_al:LIPIcs.ICALP.2024.133,
  author =	{Dantam, Mohan and Mayr, Richard},
  title =	{{Finite-Memory Strategies for Almost-Sure Energy-MeanPayoff Objectives in MDPs}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{133:1--133:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.133},
  URN =		{urn:nbn:de:0030-drops-202762},
  doi =		{10.4230/LIPIcs.ICALP.2024.133},
  annote =	{Keywords: Markov decision processes, energy, mean payoff, parity, strategy complexity}
}

Document

Track B: Automata, Logic, Semantics, and Theory of Programming

DOI: 10.4230/LIPIcs.ICALP.2024.156

Verification of Population Protocols with Unordered Data

Authors: Steffen van Bergerem, Roland Guttenberg, Sandra Kiefer, Corto Mascle, Nicolas Waldburger, and Chana Weil-Kennedy

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

Abstract

Population protocols are a well-studied model of distributed computation in which a group of anonymous finite-state agents communicates via pairwise interactions. Together they decide whether their initial configuration, i. e., the initial distribution of agents in the states, satisfies a property. As an extension in order to express properties of multisets over an infinite data domain, Blondin and Ladouceur (ICALP'23) introduced population protocols with unordered data (PPUD). In PPUD, each agent carries a fixed data value, and the interactions between agents depend on whether their data are equal or not. Blondin and Ladouceur also identified the interesting subclass of immediate observation PPUD (IOPPUD), where in every transition one of the two agents remains passive and does not move, and they characterised its expressive power. We study the decidability and complexity of formally verifying these protocols. The main verification problem for population protocols is well-specification, that is, checking whether the given PPUD computes some function. We show that well-specification is undecidable in general. By contrast, for IOPPUD, we exhibit a large yet natural class of problems, which includes well-specification among other classic problems, and establish that these problems are in ExpSpace. We also provide a lower complexity bound, namely coNExpTime-hardness.

Cite as

Steffen van Bergerem, Roland Guttenberg, Sandra Kiefer, Corto Mascle, Nicolas Waldburger, and Chana Weil-Kennedy. Verification of Population Protocols with Unordered Data. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 156:1-156:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{vanbergerem_et_al:LIPIcs.ICALP.2024.156,
  author =	{van Bergerem, Steffen and Guttenberg, Roland and Kiefer, Sandra and Mascle, Corto and Waldburger, Nicolas and Weil-Kennedy, Chana},
  title =	{{Verification of Population Protocols with Unordered Data}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{156:1--156:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.156},
  URN =		{urn:nbn:de:0030-drops-202993},
  doi =		{10.4230/LIPIcs.ICALP.2024.156},
  annote =	{Keywords: Population protocols, Parameterized verification, Distributed computing, Well-specification}
}

@InProceedings{vanbergerem_et_al:LIPIcs.ICALP.2024.156,
  author =	{van Bergerem, Steffen and Guttenberg, Roland and Kiefer, Sandra and Mascle, Corto and Waldburger, Nicolas and Weil-Kennedy, Chana},
  title =	{{Verification of Population Protocols with Unordered Data}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{156:1--156:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.156},
  URN =		{urn:nbn:de:0030-drops-202993},
  doi =		{10.4230/LIPIcs.ICALP.2024.156},
  annote =	{Keywords: Population protocols, Parameterized verification, Distributed computing, Well-specification}
}

Document

Invited Paper

DOI: 10.4230/LIPIcs.CONCUR.2020.3

Safe Reinforcement Learning Using Probabilistic Shields (Invited Paper)

Authors: Nils Jansen, Bettina Könighofer, Sebastian Junges, Alex Serban, and Roderick Bloem

Published in: LIPIcs, Volume 171, 31st International Conference on Concurrency Theory (CONCUR 2020)

Abstract

This paper concerns the efficient construction of a safety shield for reinforcement learning. We specifically target scenarios that incorporate uncertainty and use Markov decision processes (MDPs) as the underlying model to capture such problems. Reinforcement learning (RL) is a machine learning technique that can determine near-optimal policies in MDPs that may be unknown before exploring the model. However, during exploration, RL is prone to induce behavior that is undesirable or not allowed in safety- or mission-critical contexts. We introduce the concept of a probabilistic shield that enables RL decision-making to adhere to safety constraints with high probability. We employ formal verification to efficiently compute the probabilities of critical decisions within a safety-relevant fragment of the MDP. These results help to realize a shield that, when applied to an RL algorithm, restricts the agent from taking unsafe actions, while optimizing the performance objective. We discuss tradeoffs between sufficient progress in the exploration of the environment and ensuring safety. In our experiments, we demonstrate on the arcade game PAC-MAN and on a case study involving service robots that the learning efficiency increases as the learning needs orders of magnitude fewer episodes.

Cite as

Nils Jansen, Bettina Könighofer, Sebastian Junges, Alex Serban, and Roderick Bloem. Safe Reinforcement Learning Using Probabilistic Shields (Invited Paper). In 31st International Conference on Concurrency Theory (CONCUR 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 171, pp. 3:1-3:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{jansen_et_al:LIPIcs.CONCUR.2020.3,
  author =	{Jansen, Nils and K\"{o}nighofer, Bettina and Junges, Sebastian and Serban, Alex and Bloem, Roderick},
  title =	{{Safe Reinforcement Learning Using Probabilistic Shields}},
  booktitle =	{31st International Conference on Concurrency Theory (CONCUR 2020)},
  pages =	{3:1--3:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-160-3},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{171},
  editor =	{Konnov, Igor and Kov\'{a}cs, Laura},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2020.3},
  URN =		{urn:nbn:de:0030-drops-128155},
  doi =		{10.4230/LIPIcs.CONCUR.2020.3},
  annote =	{Keywords: Safe Reinforcement Learning, Formal Verification, Safe Exploration, Model Checking, Markov Decision Process}
}

Document

DOI: 10.4230/LIPIcs.OPODIS.2017.32

Synthesis of Distributed Algorithms with Parameterized Threshold Guards

Authors: Marijana Lazic, Igor Konnov, Josef Widder, and Roderick Bloem

Published in: LIPIcs, Volume 95, 21st International Conference on Principles of Distributed Systems (OPODIS 2017)

Abstract

Fault-tolerant distributed algorithms are notoriously hard to get right. In this paper we introduce an automated method that helps in that process: the designer provides specifications (the problem to be solved) and a sketch of a distributed algorithm that keeps arithmetic details unspecified. Our tool then automatically fills the missing parts. Fault-tolerant distributed algorithms are typically parameterized, that is, they are designed to work for any number n of processes and any number t of faults, provided some resilience condition holds; e.g., n > 3t. In this paper we automatically synthesize distributed algorithms that work for all parameter values that satisfy the resilience condition. We focus on threshold- guarded distributed algorithms, where actions are taken only if a sufficiently large number of messages is received, e.g., more than t or n/2. Both expressions can be derived by choosing the right values for the coefficients a, b, and c, in the sketch of a threshold a·n+b·t+c. Our method takes as input a sketch of an asynchronous threshold-based fault-tolerant distributed algorithm — where the guards are missing exact coefficients—and then iteratively picks the values for the coefficients. Our approach combines recent progress in parameterized model checking of distributed algo- rithms with counterexample-guided synthesis. Besides theoretical results on termination of the synthesis procedure, we experimentally evaluate our method and show that it can synthesize sev- eral distributed algorithms from the literature, e.g., Byzantine reliable broadcast and Byzantine one-step consensus. In addition, for several new variations of safety and liveness specifications, our tool generates new distributed algorithms.

Cite as

Marijana Lazic, Igor Konnov, Josef Widder, and Roderick Bloem. Synthesis of Distributed Algorithms with Parameterized Threshold Guards. In 21st International Conference on Principles of Distributed Systems (OPODIS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 95, pp. 32:1-32:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{lazic_et_al:LIPIcs.OPODIS.2017.32,
  author =	{Lazic, Marijana and Konnov, Igor and Widder, Josef and Bloem, Roderick},
  title =	{{Synthesis of Distributed Algorithms with Parameterized Threshold Guards}},
  booktitle =	{21st International Conference on Principles of Distributed Systems (OPODIS 2017)},
  pages =	{32:1--32:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-061-3},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{95},
  editor =	{Aspnes, James and Bessani, Alysson and Felber, Pascal and Leit\~{a}o, Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.OPODIS.2017.32},
  URN =		{urn:nbn:de:0030-drops-86359},
  doi =		{10.4230/LIPIcs.OPODIS.2017.32},
  annote =	{Keywords: fault-tolerant distributed algorithms, byzantine faults, parameterized model checking, program synthesis}
}

4 Search Results for "Bloem, Roderick"

Finite-Memory Strategies for Almost-Sure Energy-MeanPayoff Objectives in MDPs

Abstract

Cite as

Verification of Population Protocols with Unordered Data

Abstract

Cite as

Safe Reinforcement Learning Using Probabilistic Shields (Invited Paper)

Abstract

Cite as

Synthesis of Distributed Algorithms with Parameterized Threshold Guards

Abstract

Cite as