DeepTrust^RT: Confidential Deep Neural Inference Meets Real-Time!

Authors: Mohammad Fakhruddin Babar, Monowar Hasan




File

LIPIcs.ECRTS.2024.13.pdf
  • Filesize: 1.46 MB
  • 24 pages


Author Details

Mohammad Fakhruddin Babar
  • Electrical Engineering and Computer Science, Washington State University, Pullman, WA, USA
Monowar Hasan
  • Electrical Engineering and Computer Science, Washington State University, Pullman, WA, USA

Cite As

Mohammad Fakhruddin Babar and Monowar Hasan. DeepTrust^RT: Confidential Deep Neural Inference Meets Real-Time!. In 36th Euromicro Conference on Real-Time Systems (ECRTS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 298, pp. 13:1-13:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ECRTS.2024.13

Abstract

Deep Neural Networks (DNNs) are becoming common in "learning-enabled" time-critical applications such as autonomous driving and robotics. One approach to protect DNN inference from adversarial actions and preserve model privacy/confidentiality is to execute it within the trusted enclaves available in modern processors. However, running DNN inference inside limited-capacity enclaves while ensuring timing guarantees is challenging due to (a) the large size of DNN workloads and (b) the extra switching between "normal" and "trusted" execution modes. This paper introduces new time-aware scheduling schemes, collectively called DeepTrust^RT, to securely execute deep neural inference for learning-enabled real-time systems. We first propose a variant of earliest-deadline-first (EDF) scheduling, called DeepTrust^RT-LW, that partitions each DNN layer-wise and runs the layers sequentially in the enclave. Because every individual layer slice incurs extra context-switch overhead, we further introduce a novel layer fusion technique, named DeepTrust^RT-FUSION. This scheme provides hard real-time guarantees by fusing multiple layers of DNN workloads from multiple tasks so that they fit and run concurrently within the enclaves. We implemented and tested the DeepTrust^RT schemes on the Raspberry Pi platform running OP-TEE+DarkNet-TZ DNN APIs with three DNN workloads (AlexNet-squeezed, Tiny Darknet, YOLOv3-tiny). Compared to the layer-wise partitioning approach (DeepTrust^RT-LW), DeepTrust^RT-FUSION can schedule up to 3x more tasksets and reduce context switches by up to 11.12x. We further demonstrate the efficacy of DeepTrust^RT using a flight controller (ArduPilot) case study and find that DeepTrust^RT-FUSION retains real-time guarantees where DeepTrust^RT-LW becomes unschedulable.
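
The trade-off between layer-wise execution and layer fusion can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a simple greedy packing of consecutive layers of a single task into groups that fit a hypothetical enclave memory budget, charges two world switches (one entry, one exit) per group, and applies the classic uniprocessor EDF utilization test for implicit-deadline tasks. The names `Layer`, `fuse_layers`, `layer_wise`, `task_wcet`, and `edf_schedulable`, as well as all numbers, are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Layer:
    mem_kb: int      # hypothetical enclave memory needed by this layer
    wcet_ms: float   # hypothetical worst-case execution time of this layer


def layer_wise(layers: List[Layer]) -> List[List[Layer]]:
    """Run every layer as its own enclave invocation (one group per layer)."""
    return [[layer] for layer in layers]


def fuse_layers(layers: List[Layer], capacity_kb: int) -> List[List[Layer]]:
    """Greedily pack consecutive layers into groups that fit the enclave budget."""
    groups: List[List[Layer]] = []
    current: List[Layer] = []
    used = 0
    for layer in layers:
        if layer.mem_kb > capacity_kb:
            raise ValueError("a single layer exceeds the enclave capacity")
        if current and used + layer.mem_kb > capacity_kb:
            groups.append(current)      # close the group that is full
            current, used = [], 0
        current.append(layer)
        used += layer.mem_kb
    if current:
        groups.append(current)
    return groups


def task_wcet(groups: List[List[Layer]], switch_ms: float) -> float:
    """Compute time plus an assumed two world switches per group."""
    compute = sum(layer.wcet_ms for group in groups for layer in group)
    return compute + 2 * switch_ms * len(groups)


def edf_schedulable(tasks: List[Tuple[float, float]]) -> bool:
    """Utilization test for preemptive uniprocessor EDF, implicit deadlines.

    Each task is a (wcet_ms, period_ms) pair.
    """
    return sum(wcet / period for wcet, period in tasks) <= 1.0


if __name__ == "__main__":
    layers = [Layer(300, 2.0), Layer(400, 3.0), Layer(500, 1.0), Layer(200, 4.0)]
    period_ms = 25.0
    for name, groups in [("layer-wise", layer_wise(layers)),
                         ("fused", fuse_layers(layers, capacity_kb=1024))]:
        wcet = task_wcet(groups, switch_ms=0.5)
        # Two identical tasks share the processor in this toy taskset.
        ok = edf_schedulable([(wcet, period_ms), (wcet, period_ms)])
        print(f"{name}: groups={len(groups)} wcet={wcet} ms schedulable={ok}")
```

With these made-up numbers, fusion halves the number of enclave invocations (two groups instead of four), and the toy two-task set passes the utilization test only in the fused case. The scheme described in the abstract additionally fuses layers across multiple tasks, which this single-task sketch does not model.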

Subject Classification

ACM Subject Classification
  • Computer systems organization → Real-time systems
  • Security and privacy → Systems security
Keywords
  • DNN
  • TrustZone
  • Real-Time Systems

