DeepTrust^RT: Confidential Deep Neural Inference Meets Real-Time!

Babar, Mohammad Fakhruddin; Hasan, Monowar

doi:10.4230/LIPIcs.ECRTS.2024.13

Abstract

Deep Neural Networks (DNNs) are becoming common in "learning-enabled" time-critical applications such as autonomous driving and robotics. One approach to protect DNN inference from adversarial actions and preserve model privacy/confidentiality is to execute it within the trusted enclaves available in modern processors. However, running DNN inference inside limited-capacity enclaves while ensuring timing guarantees is challenging due to (a) the large size of DNN workloads and (b) the extra switching between "normal" and "trusted" execution modes. This paper introduces new time-aware scheduling schemes, collectively called DeepTrust^RT, to securely execute deep neural inferences for learning-enabled real-time systems. We first propose a variant of EDF (called DeepTrust^RT-LW) that partitions each DNN layer-wise and runs the resulting layer slices sequentially in the enclave. Because the individual layer slices incur extra context-switch overheads, we further introduce a novel layer fusion technique (named DeepTrust^RT-FUSION). This scheme provides hard real-time guarantees by fusing multiple layers of DNN workloads from multiple tasks, which allows them to fit and run concurrently within the enclaves. We implemented and tested the DeepTrust^RT ideas on the Raspberry Pi platform running OP-TEE+DarkNet-TZ DNN APIs and three DNN workloads (AlexNet-squeezed, Tiny Darknet, YOLOv3-tiny). Compared to the layer-wise partitioning approach (DeepTrust^RT-LW), DeepTrust^RT-FUSION can schedule up to 3x more tasksets and reduce context switches by up to 11.12x. We further demonstrate the efficacy of DeepTrust^RT using a flight controller (ArduPilot) case study and find that DeepTrust^RT-FUSION retains real-time guarantees where DeepTrust^RT-LW becomes unschedulable.
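To make the intuition behind layer fusion concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a hypothetical per-layer memory footprint (`layer_sizes_kb`) and a hypothetical enclave capacity (`enclave_budget_kb`), and it greedily packs consecutive layers into groups that fit the enclave. Under the further assumption that each group costs one entry into the trusted mode, fewer groups mean fewer mode switches than running one layer per entry.

```python
from typing import List


def fuse_layers(layer_sizes_kb: List[int], enclave_budget_kb: int) -> List[List[int]]:
    """Greedily group consecutive layers so that each group fits the budget.

    Illustrative only: sizes and budget are hypothetical inputs, and this
    first-fit packing is an assumption, not the paper's fusion scheme.
    Returns a list of groups, each a list of layer indices.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    used = 0
    for idx, size in enumerate(layer_sizes_kb):
        if size > enclave_budget_kb:
            raise ValueError(f"layer {idx} ({size} KB) exceeds the enclave budget")
        # Close the current group when adding this layer would overflow it.
        if current and used + size > enclave_budget_kb:
            groups.append(current)
            current, used = [], 0
        current.append(idx)
        used += size
    if current:
        groups.append(current)
    return groups


# Hypothetical six-layer network and a 600 KB enclave budget.
layer_sizes_kb = [300, 200, 400, 100, 100, 500]
enclave_budget_kb = 600

layer_wise = [[i] for i in range(len(layer_sizes_kb))]   # one layer per entry
fused = fuse_layers(layer_sizes_kb, enclave_budget_kb)   # several layers per entry

print("layer-wise enclave entries:", len(layer_wise))
print("fused enclave entries:", len(fused), fused)
```

With these made-up numbers the layer-wise split enters the enclave six times, while the fused grouping enters it three times. The sketch ignores timing analysis entirely; it only shows why grouping layers reduces the number of switches.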

