Synthesis of Safe, Optimal and Compact Strategies for Stochastic Hybrid Games (Invited Paper)

Larsen, Kim G.

doi:10.4230/LIPIcs.CONCUR.2019.2

File

Cite AsGet BibTex

Kim G. Larsen. Synthesis of Safe, Optimal and Compact Strategies for Stochastic Hybrid Games (Invited Paper). In 30th International Conference on Concurrency Theory (CONCUR 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 140, pp. 2:1-2:5, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.CONCUR.2019.2

Abstract

UPPAAL-Stratego is a recent branch of the verification tool UPPAAL allowing for synthesis of safe and optimal strategies for stochastic timed (hybrid) games. We describe newly developed learning methods, allowing for synthesis of significantly better strategies and with much improved convergence behaviour. Also, we describe novel use of decision trees for learning orders-of-magnitude more compact strategy representation. In both cases, the seek for optimality does not compromise safety.

Subject Classification

ACM Subject Classification

Theory of computation → Timed and hybrid models
Theory of computation → Verification by model checking
Theory of computation → Online learning theory
Theory of computation → Reinforcement learning

Keywords

Timed automata
Stochastic hybrid grame
Symbolic synthesis
Reinforcement learning
Q-learning
M-learning

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

PDF Downloads

0

Metadata Views

References

Pranav Ashok, Jan Kretinsky, Kim Guldstrand Larsen, Adrien Le Coent, Jakob Haahr Taankvist, and Maximilian Weininger. SOS: Safe, Optimal and Small Strategies for Stochastic Hybrid Games. In To appear in Proceedings of QEST, 2019.
Andrew G. Barto, Steven J. Bradtke, and Satinder P. Singh. Learning to Act Using Real-time Dynamic Programming. Artif. Intell., 72(1-2):81-138, January 1995. URL: https://doi.org/10.1016/0004-3702(94)00011-O.
Gerd Behrmann, Agnès Cougnard, Alexandre David, Emmanuel Fleury, Kim Guldstrand Larsen, and Didier Lime. UPPAAL-Tiga: Time for Playing Games! In Werner Damm and Holger Hermanns, editors, Computer Aided Verification, 19th International Conference, CAV 2007, Berlin, Germany, July 3-7, 2007, Proceedings, volume 4590 of Lecture Notes in Computer Science, pages 121-125. Springer, 2007. URL: https://doi.org/10.1007/978-3-540-73368-3_14.
Adrien Le Coënt, Laurent Fribourg, Nicolas Markey, Florian De Vuyst, and Ludovic Chamoin. Compositional synthesis of state-dependent switching control. Theor. Comput. Sci., 750:53-68, 2018. URL: https://doi.org/10.1016/j.tcs.2018.01.021.
Alexandre David, Peter Gjøl Jensen, Kim Guldstrand Larsen, Axel Legay, Didier Lime, Mathias Grund Sørensen, and Jakob Haahr Taankvist. On Time with Minimal Expected Cost! In Franck Cassez and Jean-François Raskin, editors, Automated Technology for Verification and Analysis - 12th International Symposium, ATVA 2014, Sydney, NSW, Australia, November 3-7, 2014, Proceedings, volume 8837 of Lecture Notes in Computer Science, pages 129-145. Springer, 2014. URL: https://doi.org/10.1007/978-3-319-11936-6_10.
Alexandre David, Peter Gjøl Jensen, Kim Guldstrand Larsen, Marius Mikucionis, and Jakob Haahr Taankvist. Uppaal Stratego. In Christel Baier and Cesare Tinelli, editors, Tools and Algorithms for the Construction and Analysis of Systems - 21st International Conference, TACAS 2015, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2015, London, UK, April 11-18, 2015. Proceedings, volume 9035 of Lecture Notes in Computer Science, pages 206-211. Springer, 2015. URL: https://doi.org/10.1007/978-3-662-46681-0_16.
Andreas Berre Eriksen, Harry Lahrmann, and Kim Guldstrand Larsen andJakob Haahr Taankvist. Controlling Signalized Intersections using Machine Learning. In World Conference on Transport Research, 2019.
David Henriques, João Martins, Paolo Zuliani, André Platzer, and Edmund M. Clarke. Statistical Model Checking for Markov Decision Processes. In Ninth International Conference on Quantitative Evaluation of Systems, QEST 2012, London, United Kingdom, September 17-20, 2012, pages 84-93. IEEE Computer Society, 2012. URL: https://doi.org/10.1109/QEST.2012.19.
Manfred Jaeger, Peter G. Jensen, Kim G. Larsen, Axel Legay, Sean Sedwards, and Jakob H. Taankvist. Teaching Stratego to Play Ball> Optimal Synthesis fo Continuous Space MDPs. In To appear in Proceedings of ATVA, 2019.
Shyam Lal Karra, Kim Guldstrand Larsen, Florian Lorber, and Jirí Srba. Safe and Time-Optimal Control for Railway Games. In Simon Collart Dutilleul, Thierry Lecomte, and Alexander B. Romanovsky, editors, Reliability, Safety, and Security of Railway Systems. Modelling, Analysis, Verification, and Certification - Third International Conference, RSSRail 2019, Lille, France, June 4-6, 2019, Proceedings, volume 11495 of Lecture Notes in Computer Science, pages 106-122. Springer, 2019. URL: https://doi.org/10.1007/978-3-030-18744-6_7.
Kim G. Larsen, Marius Mikucionis, Marco Muñiz, Jirí Srba, and Jakob Haahr Taankvist. Online and Compositional Learning of Controllers with Application to Floor Heating. In Marsha Chechik and Jean-François Raskin, editors, Tools and Algorithms for the Construction and Analysis of Systems - 22nd International Conference, TACAS 2016, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2016, Eindhoven, The Netherlands, April 2-8, 2016, Proceedings, volume 9636 of Lecture Notes in Computer Science, pages 244-259. Springer, 2016. URL: https://doi.org/10.1007/978-3-662-49674-9_14.
Kim Guldstrand Larsen, Adrien Le Coënt, Marius Mikucionis, and Jakob Haahr Taankvist. Guaranteed Control Synthesis for Continuous Systems in Uppaal Tiga. In Roger D. Chamberlain, Walid Taha, and Martin Törngren, editors, Cyber Physical Systems. Model-Based Design - 8th International Workshop, CyPhy 2018, and 14th International Workshop, WESE 2018, Turin, Italy, October 4-5, 2018, Revised Selected Papers, volume 11615 of Lecture Notes in Computer Science, pages 113-133. Springer, 2018. URL: https://doi.org/10.1007/978-3-030-23703-5_6.
Kim Guldstrand Larsen, Marius Mikucionis, and Jakob Haahr Taankvist. Safe and Optimal Adaptive Cruise Control. In Roland Meyer, André Platzer, and Heike Wehrheim, editors, Correct System Design - Symposium in Honor of Ernst-Rüdiger Olderog on the Occasion of His 60th Birthday, Oldenburg, Germany, September 8-9, 2015. Proceedings, volume 9360 of Lecture Notes in Computer Science, pages 260-277. Springer, 2015. URL: https://doi.org/10.1007/978-3-319-23506-6_17.
Alexander L. Strehl, Lihong Li, and Michael L. Littman. Incremental Model-based Learners With Formal Learning-Time Guarantees. CoRR, 2012. URL: http://arxiv.org/abs/1206.6870.
Christopher John Cornish Hellaby Watkins. Learning from delayed rewards. PhD thesis, King’s College, Cambridge, 1989.

Synthesis of Safe, Optimal and Compact Strategies for Stochastic Hybrid Games (Invited Paper)

Author Kim G. Larsen

File

Document Identifiers

Author Details

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message

Synthesis of Safe, Optimal and Compact Strategies for Stochastic Hybrid Games (Invited Paper)

Author Kim G. Larsen

File

Document Identifiers

Author Details

Funding

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message