Document

DOI: 10.4230/LIPIcs.CONCUR.2023.35

Deciding What Is Good-For-MDPs

Authors: Sven Schewe, Qiyi Tang, and Tansholpan Zhanabekova

Published in: LIPIcs, Volume 279, 34th International Conference on Concurrency Theory (CONCUR 2023)

Abstract

Nondeterministic good-for-MDPs (GFM) automata are for MDP model checking and reinforcement learning what good-for-games automata are for reactive synthesis: a more compact alternative to deterministic automata that displays nondeterminism, but only so much that it can be resolved locally, such that a syntactic product can be analysed. GFM has recently been introduced as a property for reinforcement learning, where the simpler Büchi acceptance conditions it allows to use is key. However, while there are classic and novel techniques to obtain automata that are GFM, there has not been a decision procedure for checking whether or not an automaton is GFM. We show that GFM-ness is decidable and provide an EXPTIME decision procedure as well as a PSPACE-hardness proof.

Cite as

Sven Schewe, Qiyi Tang, and Tansholpan Zhanabekova. Deciding What Is Good-For-MDPs. In 34th International Conference on Concurrency Theory (CONCUR 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 279, pp. 35:1-35:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

@InProceedings{schewe_et_al:LIPIcs.CONCUR.2023.35,
  author =	{Schewe, Sven and Tang, Qiyi and Zhanabekova, Tansholpan},
  title =	{{Deciding What Is Good-For-MDPs}},
  booktitle =	{34th International Conference on Concurrency Theory (CONCUR 2023)},
  pages =	{35:1--35:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-299-0},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{279},
  editor =	{P\'{e}rez, Guillermo A. and Raskin, Jean-Fran\c{c}ois},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2023.35},
  URN =		{urn:nbn:de:0030-drops-190290},
  doi =		{10.4230/LIPIcs.CONCUR.2023.35},
  annote =	{Keywords: B\"{u}chi automata, Markov Decision Processes, Omega-regular objectives, Reinforcement learning}
}

Document

DOI: 10.4230/LIPIcs.CONCUR.2022.32

Strategies for MDP Bisimilarity Equivalence and Inequivalence

Authors: Stefan Kiefer and Qiyi Tang

Published in: LIPIcs, Volume 243, 33rd International Conference on Concurrency Theory (CONCUR 2022)

Abstract

A labelled Markov decision process (MDP) is a labelled Markov chain with nondeterminism; i.e., together with a strategy a labelled MDP induces a labelled Markov chain. Motivated by applications to the verification of probabilistic noninterference in security, we study problems whether there exist strategies such that the labelled MDPs become bisimilarity equivalent/inequivalent. We show that the equivalence problem is decidable; in fact, it is EXPTIME-complete and becomes NP-complete if one of the MDPs is a Markov chain. Concerning the inequivalence problem, we show that (1) it is decidable in polynomial time; (2) if there are strategies for inequivalence then there are memoryless strategies for inequivalence; (3) such memoryless strategies can be computed in polynomial time.

Cite as

Stefan Kiefer and Qiyi Tang. Strategies for MDP Bisimilarity Equivalence and Inequivalence. In 33rd International Conference on Concurrency Theory (CONCUR 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 243, pp. 32:1-32:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)

@InProceedings{kiefer_et_al:LIPIcs.CONCUR.2022.32,
  author =	{Kiefer, Stefan and Tang, Qiyi},
  title =	{{Strategies for MDP Bisimilarity Equivalence and Inequivalence}},
  booktitle =	{33rd International Conference on Concurrency Theory (CONCUR 2022)},
  pages =	{32:1--32:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-246-4},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{243},
  editor =	{Klin, Bartek and Lasota, S{\l}awomir and Muscholl, Anca},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2022.32},
  URN =		{urn:nbn:de:0030-drops-170955},
  doi =		{10.4230/LIPIcs.CONCUR.2022.32},
  annote =	{Keywords: Markov decision processes, Markov chains}
}

Document

DOI: 10.4230/LIPIcs.FSTTCS.2021.48

Approximate Bisimulation Minimisation

Authors: Stefan Kiefer and Qiyi Tang

Published in: LIPIcs, Volume 213, 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)

Abstract

We propose polynomial-time algorithms to minimise labelled Markov chains whose transition probabilities are not known exactly, have been perturbed, or can only be obtained by sampling. Our algorithms are based on a new notion of an approximate bisimulation quotient, obtained by lumping together states that are exactly bisimilar in a slightly perturbed system. We present experiments that show that our algorithms are able to recover the structure of the bisimulation quotient of the unperturbed system.

Cite as

Stefan Kiefer and Qiyi Tang. Approximate Bisimulation Minimisation. In 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 213, pp. 48:1-48:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

@InProceedings{kiefer_et_al:LIPIcs.FSTTCS.2021.48,
  author =	{Kiefer, Stefan and Tang, Qiyi},
  title =	{{Approximate Bisimulation Minimisation}},
  booktitle =	{41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)},
  pages =	{48:1--48:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-215-0},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{213},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Chekuri, Chandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2021.48},
  URN =		{urn:nbn:de:0030-drops-155599},
  doi =		{10.4230/LIPIcs.FSTTCS.2021.48},
  annote =	{Keywords: Markov chains, Behavioural metrics, Bisimulation}
}

Document

DOI: 10.4230/LIPIcs.FSTTCS.2020.49

Comparing Labelled Markov Decision Processes

Authors: Stefan Kiefer and Qiyi Tang

Published in: LIPIcs, Volume 182, 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020)

Abstract

A labelled Markov decision process is a labelled Markov chain with nondeterminism, i.e., together with a strategy a labelled MDP induces a labelled Markov chain. The model is related to interval Markov chains. Motivated by applications of equivalence checking for the verification of anonymity, we study the algorithmic comparison of two labelled MDPs, in particular, whether there exist strategies such that the MDPs become equivalent/inequivalent, both in terms of trace equivalence and in terms of probabilistic bisimilarity. We provide the first polynomial-time algorithms for computing memoryless strategies to make the two labelled MDPs inequivalent if such strategies exist. We also study the computational complexity of qualitative problems about making the total variation distance and the probabilistic bisimilarity distance less than one or equal to one.

Cite as

Stefan Kiefer and Qiyi Tang. Comparing Labelled Markov Decision Processes. In 40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 182, pp. 49:1-49:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

@InProceedings{kiefer_et_al:LIPIcs.FSTTCS.2020.49,
  author =	{Kiefer, Stefan and Tang, Qiyi},
  title =	{{Comparing Labelled Markov Decision Processes}},
  booktitle =	{40th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2020)},
  pages =	{49:1--49:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-174-0},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{182},
  editor =	{Saxena, Nitin and Simon, Sunil},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2020.49},
  URN =		{urn:nbn:de:0030-drops-132903},
  doi =		{10.4230/LIPIcs.FSTTCS.2020.49},
  annote =	{Keywords: Markov decision processes, Markov chains, Behavioural metrics}
}

Document

DOI: 10.4230/LIPIcs.CONCUR.2019.9

Computing Probabilistic Bisimilarity Distances for Probabilistic Automata

Authors: Giorgio Bacci, Giovanni Bacci, Kim G. Larsen, Radu Mardare, Qiyi Tang, and Franck van Breugel

Published in: LIPIcs, Volume 140, 30th International Conference on Concurrency Theory (CONCUR 2019)

Abstract

The probabilistic bisimilarity distance of Deng et al. has been proposed as a robust quantitative generalization of Segala and Lynch’s probabilistic bisimilarity for probabilistic automata. In this paper, we present a novel characterization of the bisimilarity distance as the solution of a simple stochastic game. The characterization gives us an algorithm to compute the distances by applying Condon’s simple policy iteration on these games. The correctness of Condon’s approach, however, relies on the assumption that the games are stopping. Our games may be non-stopping in general, yet we are able to prove termination for this extended class of games. Already other algorithms have been proposed in the literature to compute these distances, with complexity in UP ∩ coUP and PPAD. Despite the theoretical relevance, these algorithms are inefficient in practice. To the best of our knowledge, our algorithm is the first practical solution. In the proofs of all the above-mentioned results, an alternative presentation of the Hausdorff distance due to Mémoli plays a central rôle.

Cite as

Giorgio Bacci, Giovanni Bacci, Kim G. Larsen, Radu Mardare, Qiyi Tang, and Franck van Breugel. Computing Probabilistic Bisimilarity Distances for Probabilistic Automata. In 30th International Conference on Concurrency Theory (CONCUR 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 140, pp. 9:1-9:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

@InProceedings{bacci_et_al:LIPIcs.CONCUR.2019.9,
  author =	{Bacci, Giorgio and Bacci, Giovanni and Larsen, Kim G. and Mardare, Radu and Tang, Qiyi and van Breugel, Franck},
  title =	{{Computing Probabilistic Bisimilarity Distances for Probabilistic Automata}},
  booktitle =	{30th International Conference on Concurrency Theory (CONCUR 2019)},
  pages =	{9:1--9:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-121-4},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{140},
  editor =	{Fokkink, Wan and van Glabbeek, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2019.9},
  URN =		{urn:nbn:de:0030-drops-109119},
  doi =		{10.4230/LIPIcs.CONCUR.2019.9},
  annote =	{Keywords: Probabilistic automata, Behavioural metrics, Simple stochastic games, Simple policy iteration algorithm}
}

Document

DOI: 10.4230/LIPIcs.CONCUR.2018.9

Deciding Probabilistic Bisimilarity Distance One for Probabilistic Automata

Authors: Qiyi Tang and Franck van Breugel

Published in: LIPIcs, Volume 118, 29th International Conference on Concurrency Theory (CONCUR 2018)

Abstract

Probabilistic bisimilarity, due to Segala and Lynch, is an equivalence relation that captures which states of a probabilistic automaton behave exactly the same. Deng, Chothia, Palamidessi and Pang proposed a robust quantitative generalization of probabilistic bisimilarity. Their probabilistic bisimilarity distances of states of a probabilistic automaton capture the similarity of their behaviour. The smaller the distance, the more alike the states behave. In particular, states are probabilistic bisimilar if and only if their distance is zero. Although the complexity of computing probabilistic bisimilarity distances for probabilistic automata has already been studied and shown to be in NP ∩ coNP and PPAD, we are not aware of any practical algorithm to compute those distances. In this paper we provide several key results towards algorithms to compute probabilistic bisimilarity distances for probabilistic automata. In particular, we present a polynomial time algorithm that decides distance one. Furthermore, we give an alternative characterization of the probabilistic bisimilarity distances as a basis for a policy iteration algorithm.

Cite as

Qiyi Tang and Franck van Breugel. Deciding Probabilistic Bisimilarity Distance One for Probabilistic Automata. In 29th International Conference on Concurrency Theory (CONCUR 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 118, pp. 9:1-9:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

@InProceedings{tang_et_al:LIPIcs.CONCUR.2018.9,
  author =	{Tang, Qiyi and van Breugel, Franck},
  title =	{{Deciding Probabilistic Bisimilarity Distance One for Probabilistic Automata}},
  booktitle =	{29th International Conference on Concurrency Theory (CONCUR 2018)},
  pages =	{9:1--9:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-087-3},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{118},
  editor =	{Schewe, Sven and Zhang, Lijun},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2018.9},
  URN =		{urn:nbn:de:0030-drops-95472},
  doi =		{10.4230/LIPIcs.CONCUR.2018.9},
  annote =	{Keywords: probabilistic automaton, probabilistic bisimilarity, distance}
}

Document

DOI: 10.4230/LIPIcs.CONCUR.2017.27

Algorithms to Compute Probabilistic Bisimilarity Distances for Labelled Markov Chains

Authors: Qiyi Tang and Franck van Breugel

Published in: LIPIcs, Volume 85, 28th International Conference on Concurrency Theory (CONCUR 2017)

Abstract

In the late nineties, Desharnais, Gupta, Jagadeesan and Panangaden presented probabilistic bisimilarity distances on the states of a labelled Markov chain. This provided a quantitative generalisation of probabilistic bisimilarity introduced by Larsen and Skou a decade earlier. In the last decade, several algorithms to approximate and compute these probabilistic bisimilarity distances have been put forward. In this paper, we correct, improve and generalise some of these algorithms. Furthermore, we compare their performance experimentally.

Cite as

Qiyi Tang and Franck van Breugel. Algorithms to Compute Probabilistic Bisimilarity Distances for Labelled Markov Chains. In 28th International Conference on Concurrency Theory (CONCUR 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 85, pp. 27:1-27:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

@InProceedings{tang_et_al:LIPIcs.CONCUR.2017.27,
  author =	{Tang, Qiyi and van Breugel, Franck},
  title =	{{Algorithms to Compute Probabilistic Bisimilarity Distances for Labelled Markov Chains}},
  booktitle =	{28th International Conference on Concurrency Theory (CONCUR 2017)},
  pages =	{27:1--27:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-048-4},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{85},
  editor =	{Meyer, Roland and Nestmann, Uwe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2017.27},
  URN =		{urn:nbn:de:0030-drops-77983},
  doi =		{10.4230/LIPIcs.CONCUR.2017.27},
  annote =	{Keywords: labelled Markov chain, probabilistic bisimilarity, pseudometric, policy iteration}
}

Document

DOI: 10.4230/LIPIcs.CONCUR.2016.22

Computing Probabilistic Bisimilarity Distances via Policy Iteration

Authors: Qiyi Tang and Franck van Breugel

Published in: LIPIcs, Volume 59, 27th International Conference on Concurrency Theory (CONCUR 2016)

Abstract

A transformation mapping a labelled Markov chain to a simple stochastic game is presented. In the resulting simple stochastic game, each vertex corresponds to a pair of states of the labelled Markov chain. The value of a vertex of the simple stochastic game is shown to be equal to the probabilistic bisimilarity distance, a notion due to Desharnais, Gupta, Jagadeesan and Panangaden, of the corresponding pair of states of the labelled Markov chain. Bacci, Bacci, Larsen and Mardare introduced an algorithm to compute the probabilistic bisimilarity distances for a labelled Markov chain. A modification of a basic version of their algorithm for a labelled Markov chain is shown to be the policy iteration algorithm applied to the corresponding simple stochastic game. Furthermore, it is shown that this algorithm takes exponential time in the worst case.

Cite as

Qiyi Tang and Franck van Breugel. Computing Probabilistic Bisimilarity Distances via Policy Iteration. In 27th International Conference on Concurrency Theory (CONCUR 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 59, pp. 22:1-22:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

@InProceedings{tang_et_al:LIPIcs.CONCUR.2016.22,
  author =	{Tang, Qiyi and van Breugel, Franck},
  title =	{{Computing Probabilistic Bisimilarity Distances via Policy Iteration}},
  booktitle =	{27th International Conference on Concurrency Theory (CONCUR 2016)},
  pages =	{22:1--22:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-017-0},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{59},
  editor =	{Desharnais, Jos\'{e}e and Jagadeesan, Radha},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2016.22},
  URN =		{urn:nbn:de:0030-drops-61837},
  doi =		{10.4230/LIPIcs.CONCUR.2016.22},
  annote =	{Keywords: labelled Markov chain, simple stochastic game, probabilistic bisimilarity, pseudometric, value function, policy iteration}
}
