Using Quantile Regression in Neural Networks for Contention Prediction in Multicore Processors

Authors: Axel Brando, Isabel Serra, Enrico Mezzetti, Jaume Abella, Francisco J. Cazorla



File
  • LIPIcs.ECRTS.2022.4.pdf (5.25 MB, 25 pages)

Author Details

Axel Brando
  • Barcelona Supercomputing Center (BSC), Spain
Isabel Serra
  • Barcelona Supercomputing Center (BSC), Spain
  • Centre de Recerca Matemàtica, Barcelona, Spain
Enrico Mezzetti
  • Barcelona Supercomputing Center (BSC), Spain
  • Maspatechnologies S.L, Barcelona, Spain
Jaume Abella
  • Barcelona Supercomputing Center (BSC), Spain
Francisco J. Cazorla
  • Barcelona Supercomputing Center (BSC), Spain
  • Maspatechnologies S.L, Barcelona, Spain

Cite As

Axel Brando, Isabel Serra, Enrico Mezzetti, Jaume Abella, and Francisco J. Cazorla. Using Quantile Regression in Neural Networks for Contention Prediction in Multicore Processors. In 34th Euromicro Conference on Real-Time Systems (ECRTS 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 231, pp. 4:1-4:25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/LIPIcs.ECRTS.2022.4

Abstract

The development of multicore-based embedded real-time systems is a complex process that encompasses several phases. During the software design and development phases (DDP), and prior to the validation phase, key decisions are taken that cover several aspects of the system under development, from hardware selection and configuration to the identification and mapping of software functions onto the processing nodes. The timing dimension steers a large fraction of those decisions, as the correctness of the final system ultimately depends on the implemented functions being able to execute within the allotted time budgets. Early execution time figures are thus needed already in the DDP to prevent flawed design decisions whose timing misbehavior would only be intercepted at the timing analysis step in the advanced development phases, when rolling back to different design decisions is extremely onerous. Multicore timing interference compounds this situation, as it has been shown to largely impact the execution time of tasks and must therefore be factored in when deriving early timing bounds. To effectively prevent misconfigurations while preserving resource efficiency, early timing estimates, typically derived from previous projects or early versions of the software functions, should conservatively yet tightly overestimate the timing requirements of the final system configuration, including multicore contention.

In this work, we show that multi-linear regression (MLR) models and neural network (NN) models can be used to predict the impact of multicore contention on tasks' execution time and, hence, to derive contention-aware early time budgets as soon as a release (binary) of the application is available. However, these techniques, widely used in mainstream domains, minimize the average/mean error, so the predicted impact of contention frequently underestimates the impact that can actually arise at run time. To close this gap, we propose the use of quantile regression neural networks (QRNN), which are specifically designed to predict a desired high quantile. QRNN reduces the number of underestimations compared to MLR and NN models while containing the overestimation and preserving high prediction quality. For a set of workloads composed of representative kernels running on an NXP T2080 processor, QRNN reduces the number of underestimations to 8.8%, compared to 46.8% and 31.3% for MLR and NN models respectively, while keeping the average overestimation at 1%. QRNN also exposes a parameter, the target quantile, that allows controlling the behavior of the predictions so that it adapts to the user's needs.
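The core idea behind QRNN is to replace the mean-oriented loss of a standard NN regressor with the pinball (quantile) loss, so that the network learns a high conditional quantile of the contention slowdown rather than its mean. The sketch below is not the authors' implementation; it is a minimal Keras/TensorFlow illustration of that idea, in which the feature count, layer sizes, target quantile of 0.95, and synthetic data are all assumptions made for the example.

```python
# Minimal QRNN sketch (not the authors' implementation): a small feed-forward
# network trained with the pinball (quantile) loss instead of MSE, so it
# predicts a high quantile of the contention slowdown rather than its mean.
# Feature count, layer sizes, target quantile, and data are illustrative.
import numpy as np
import tensorflow as tf

TAU = 0.95  # target quantile; exposed to the user to tune conservativeness


def pinball_loss(y_true, y_pred):
    # Penalizes underestimation (err > 0) with weight TAU and
    # overestimation (err < 0) with weight 1 - TAU.
    err = y_true - y_pred
    return tf.reduce_mean(tf.maximum(TAU * err, (TAU - 1.0) * err))


n_features = 16  # hypothetical, e.g. event counters of the task and its co-runners

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),  # predicted slowdown at quantile TAU
])
model.compile(optimizer="adam", loss=pinball_loss)

# Synthetic stand-in data only so the sketch runs end to end.
rng = np.random.default_rng(0)
X = rng.random((1024, n_features), dtype=np.float32)
y = (1.0 + 2.0 * X[:, :4].sum(axis=1)
     + rng.gamma(2.0, 0.1, size=1024)).astype(np.float32).reshape(-1, 1)

model.fit(X, y, epochs=10, batch_size=64, verbose=0)
q_pred = model.predict(X[:5], verbose=0)  # conservative per-sample estimates
```

With TAU > 0.5 the loss charges more for underestimating than for overestimating, which pushes predictions toward conservative yet tight bounds; moving TAU closer to 1 makes the model more conservative, mirroring the user-tunable target quantile described in the abstract. Replacing pinball_loss with a mean squared error loss recovers a standard NN regressor that targets the conditional mean and, as argued above, underestimates far more often.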

Subject Classification

ACM Subject Classification
  • Computer systems organization → Real-time system architecture
Keywords
  • Neural Networks
  • Quantile Prediction
  • Multicore Contention
