Diagnosing Non-Intermittent Anomalies in Reinforcement Learning Policy Executions (Short Paper)

Authors: Avraham Natan, Roni Stern, Meir Kalech




File

OASIcs.DX.2024.23.pdf
  • Filesize: 1.08 MB
  • 13 pages


Author Details

Avraham Natan
  • Ben-Gurion University of the Negev, Beersheba, Israel
Roni Stern
  • Ben-Gurion University of the Negev, Beersheba, Israel
Meir Kalech
  • Ben-Gurion University of the Negev, Beersheba, Israel

Cite As

Avraham Natan, Roni Stern, and Meir Kalech. Diagnosing Non-Intermittent Anomalies in Reinforcement Learning Policy Executions (Short Paper). In 35th International Conference on Principles of Diagnosis and Resilient Systems (DX 2024). Open Access Series in Informatics (OASIcs), Volume 125, pp. 23:1-23:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024) https://doi.org/10.4230/OASIcs.DX.2024.23

Abstract

Reinforcement learning (RL) algorithms output policies specifying which action an agent should take in a given state. However, anomalies can arise during policy execution due to internal faults in the agent, so that actions have unexpected effects. In this work, we aim to diagnose such faults and infer their root cause. We consider two types of diagnosis problems. In the first, which we call RLDXw, we assume we only know what a normal execution looks like. In the second, called RLDXs, we assume we also have models of the faulty behavior of a component, which we call fault modes. The solution to RLDXw is the time step at which a fault first occurred. The solution to RLDXs is more informative: it is the fault mode according to which the RL task was executed. Solving these problems is useful in practice because it facilitates efficient repair of faulty agents by focusing the repair effort on specific actions. We formally define RLDXw and RLDXs and design two algorithms, WFMa and SFMa, for solving them. We evaluate our algorithms on a benchmark of RL domains and discuss their strengths and limitations. As the number of observed states increases, the runtime of both WFMa and SFMa decreases (they run up to 6.5 times faster). Additionally, the runtime of SFMa increases linearly with the number of candidate fault modes.
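To make the two problem statements concrete, the following is a minimal sketch on a toy domain. It is not the paper's WFMa or SFMa. It assumes a deterministic environment, a known policy, a partial set of observed states, and a fault that is active for the whole execution. All names (`step`, `policy`, `simulate`, `first_deviation`, `consistent_fault_modes`, and the fault mode labels) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Optional

State = int
Action = str
FaultMode = Callable[[Action], Action]  # maps intended action to executed action


def step(state: State, action: Action) -> State:
    """Toy deterministic corridor with positions 0..10 (an assumption)."""
    delta = {"left": -1, "right": 1, "stay": 0}[action]
    return max(0, min(10, state + delta))


def policy(state: State) -> Action:
    """Toy policy: walk right until the end of the corridor."""
    return "right" if state < 10 else "stay"


def simulate(start: State, horizon: int,
             fault_mode: FaultMode = lambda a: a) -> List[State]:
    """Roll out the policy, passing each intended action through fault_mode."""
    states = [start]
    for _ in range(horizon):
        intended = policy(states[-1])
        states.append(step(states[-1], fault_mode(intended)))
    return states


def first_deviation(observations: Dict[int, State], start: State) -> Optional[int]:
    """Weak-model style answer: earliest observed time step that differs
    from a normal execution, or None if all observations look normal."""
    expected = simulate(start, max(observations))
    for t in sorted(observations):
        if observations[t] != expected[t]:
            return t
    return None


def consistent_fault_modes(observations: Dict[int, State], start: State,
                           fault_modes: Dict[str, FaultMode]) -> List[str]:
    """Strong-model style answer: names of candidate fault modes whose
    simulated execution matches every observed state."""
    horizon = max(observations)
    matches = []
    for name, mode in fault_modes.items():  # one rollout per candidate
        trajectory = simulate(start, horizon, mode)
        if all(trajectory[t] == s for t, s in observations.items()):
            matches.append(name)
    return matches


# Hypothetical candidate fault modes for the action "right".
candidate_modes: Dict[str, FaultMode] = {
    "no_fault": lambda a: a,
    "right_becomes_stay": lambda a: "stay" if a == "right" else a,
    "right_becomes_left": lambda a: "left" if a == "right" else a,
}

# States observed only at time steps 2 and 4 of an execution starting at 5.
observed = {2: 5, 4: 5}
print(first_deviation(observed, start=5))
print(consistent_fault_modes(observed, start=5, fault_modes=candidate_modes))
```

With partial observations, the time step returned by `first_deviation` is only an upper bound on when the fault began, since earlier unobserved steps may already have been affected. The sketch performs one rollout per candidate fault mode, which matches the reported linear growth of runtime in the number of candidates, though the paper's algorithms may achieve that differently.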

Subject Classification

ACM Subject Classification
  • Hardware → Bug detection, localization and diagnosis
  • Theory of computation → Reinforcement learning
Keywords
  • Diagnosis
  • Reinforcement Learning
  • Autonomous Systems

