Quantifying the Sim-To-Real Gap in UAV Disturbance Rejection

Authors: Austin Coursey, Marcos Quinones-Grueiro, Gautam Biswas



File
  • OASIcs.DX.2024.16.pdf (4.57 MB, 18 pages)

Document Identifiers
  • DOI: 10.4230/OASIcs.DX.2024.16

Author Details

Austin Coursey
  • Institute for Software Integrated Systems, Vanderbilt University, Nashville, TN, USA
Marcos Quinones-Grueiro
  • Institute for Software Integrated Systems, Vanderbilt University, Nashville, TN, USA
Gautam Biswas
  • Institute for Software Integrated Systems, Vanderbilt University, Nashville, TN, USA

Acknowledgements

We thank Luis Alvarez at MIT Lincoln Laboratory for his help in running the UAV flights.

Cite As

Austin Coursey, Marcos Quinones-Grueiro, and Gautam Biswas. Quantifying the Sim-To-Real Gap in UAV Disturbance Rejection. In 35th International Conference on Principles of Diagnosis and Resilient Systems (DX 2024). Open Access Series in Informatics (OASIcs), Volume 125, pp. 16:1-16:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024) https://doi.org/10.4230/OASIcs.DX.2024.16

Abstract

Due to the safety risks and the sample inefficiency of training in the real world, controllers are often developed in simulation. However, minor differences between the simulation and the real world can cause a significant sim-to-real gap, which can reduce the effectiveness of the developed controller. In this paper, we examine a case study of transferring an octorotor reinforcement learning controller from simulation to the real world. First, we quantify the effectiveness of the real-world transfer using safety metrics. We find that although trajectory deviation increases noticeably (by around 100%) in real flights, this deviation may not be considered unsafe, as it remains within safety corridors wider than 2 m. Next, we estimate the densities of the measurement distributions and compare the Jensen-Shannon divergences between simulated and real measurements. This comparison shows that the vehicle's orientation differs significantly between simulated and real flights, which we attribute to a different flight mode in the real flights, in which the vehicle turns to face the next waypoint. We also find that the reinforcement learning controller's actions appear to correctly counteract disturbance forces. We then analyze the errors of a measurement autoencoder and a neural network state transition model applied to real data. These models further confirm the difference between simulated and real attitude control, localizing the errors directly on the flight paths. Finally, we discuss important lessons learned from the sim-to-real transfer of our controller.
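
To illustrate the distribution-comparison step described in the abstract, the following is a minimal sketch (not the authors' code) of estimating a measurement channel's density with a kernel density estimator and computing the Jensen-Shannon divergence between simulated and real samples. The function name js_divergence and the arrays sim_roll and real_roll are hypothetical stand-ins for logged flight measurements; SciPy's jensenshannon returns a distance, so it is squared to obtain the divergence.

    import numpy as np
    from scipy.stats import gaussian_kde
    from scipy.spatial.distance import jensenshannon

    def js_divergence(sim: np.ndarray, real: np.ndarray, grid_size: int = 512) -> float:
        """Jensen-Shannon divergence between KDE-estimated densities."""
        # Evaluate both kernel density estimates on a shared support.
        lo = min(sim.min(), real.min())
        hi = max(sim.max(), real.max())
        grid = np.linspace(lo, hi, grid_size)
        p = gaussian_kde(sim)(grid)
        q = gaussian_kde(real)(grid)
        # jensenshannon returns the JS *distance*; square it for the divergence.
        # base=2 bounds the divergence in [0, 1].
        return jensenshannon(p, q, base=2) ** 2

    # Synthetic data standing in for logged flight measurements.
    rng = np.random.default_rng(0)
    sim_roll = rng.normal(0.0, 0.05, 5000)    # simulated roll angle (rad)
    real_roll = rng.normal(0.02, 0.08, 5000)  # real roll angle (rad)
    print(f"JS divergence (roll): {js_divergence(sim_roll, real_roll):.4f}")

A larger divergence for a channel (e.g., orientation) than for others indicates that the simulation reproduces that channel's real-world behavior less faithfully.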

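The abstract's model-based error analysis can be sketched in a similar hedged way. The snippet below is an assumed minimal setup, not the paper's architecture: a small dense autoencoder is trained on simulated measurement vectors only, then real-flight vectors are scored by reconstruction error, so error spikes flag states the simulation-trained model does not explain well. The class name MeasurementAutoencoder, the feature count, and the synthetic data are all hypothetical.

    import numpy as np
    import torch
    import torch.nn as nn

    class MeasurementAutoencoder(nn.Module):
        def __init__(self, n_features: int, latent: int = 4):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(n_features, 16), nn.ReLU(),
                                         nn.Linear(16, latent))
            self.decoder = nn.Sequential(nn.Linear(latent, 16), nn.ReLU(),
                                         nn.Linear(16, n_features))

        def forward(self, x):
            return self.decoder(self.encoder(x))

    # Hypothetical normalized measurement vectors (e.g., position,
    # velocity, attitude) from simulated and real flights.
    rng = np.random.default_rng(0)
    sim = torch.tensor(rng.normal(size=(2000, 12)), dtype=torch.float32)
    real = torch.tensor(rng.normal(0.1, 1.2, size=(500, 12)), dtype=torch.float32)

    model = MeasurementAutoencoder(n_features=12)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(200):  # train on simulated data only
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(sim), sim)
        loss.backward()
        opt.step()

    # Per-timestep reconstruction error on real data; spikes mark
    # states that depart from the simulated distribution.
    with torch.no_grad():
        err = ((model(real) - real) ** 2).mean(dim=1)
    print(f"mean real-flight reconstruction error: {err.mean():.3f}")

Plotting err along the flight path, as the paper does with its models, localizes where real behavior diverges from simulation.
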
Subject Classification

ACM Subject Classification
  • Computing methodologies → Control methods
  • Computing methodologies → Model development and analysis
Keywords
  • sim-to-real
  • disturbance rejection
  • unmanned aerial vehicles

