Energy-Aware HEVC Software Decoding On Mobile Heterogeneous Multi-Cores Architectures

Authors: Mohammed Bey Ahmed Khernache, Jalil Boukhobza, Yahia Benmoussa, Daniel Menard




Author Details

Mohammed Bey Ahmed Khernache
  • Univ. Bretagne-Sud, UMR 6285, Lab-STICC, France
Jalil Boukhobza
  • Lab-STICC UMR CNRS 6285, ENSTA Bretagne, France
Yahia Benmoussa
  • Univ. M’hamed Bougara, LMSS, Algeria
Daniel Menard
  • INSA de Rennes, UMR CNRS 6164 IETR Image Group, France

Acknowledgements

This work was supported by BPI France, Cap Digital, and Région Ile de France through the French project EFIGI.

Cite As

Mohammed Bey Ahmed Khernache, Jalil Boukhobza, Yahia Benmoussa, and Daniel Menard. Energy-Aware HEVC Software Decoding On Mobile Heterogeneous Multi-Cores Architectures. In 13th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 11th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM 2022). Open Access Series in Informatics (OASIcs), Volume 100, pp. 4:1-4:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)
https://doi.org/10.4230/OASIcs.PARMA-DITAM.2022.4

Abstract

Video content is becoming increasingly omnipresent on mobile platforms thanks to advances in mobile heterogeneous architectures. These platforms typically rely on rechargeable batteries of limited capacity, which is not improving as fast as the demands of video content. Most state-of-the-art studies propose solutions based on parallelism, to exploit the heterogeneity of the general-purpose processors (GPPs), and on dynamic voltage and frequency scaling (DVFS), to scale the GPP frequency up or down according to the video workload. However, some studies assume that information about the workload is available before decoding starts, while others do not exploit the asymmetric nature of recent mobile architectures. To address these two challenges, we propose a solution based on classification and frequency scaling. First, a model that classifies frames by type and size is built at design time. Second, this model is applied to each frame to decide which GPP cores will decode it. Third, the frequency of the chosen GPP cores is dynamically adjusted based on the output buffer size. Experiments on real-world mobile platforms show that the proposed solution can save more than 20% of energy (mJ/Frame) compared to the Ondemand Linux governor, with a miss rate below 5%. Moreover, it needs less than one second of decoding to reach a stable state, and its overhead represents less than 1% of the frame decoding time.
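The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the classifier, the thresholds, or the frequency tables, so the hand-written rule in `choose_cluster`, the 20,000-byte threshold, the buffer watermarks, and the `FREQS_MHZ` table are all hypothetical placeholders. The direction of the frequency adjustment (speed up when the output buffer is nearly empty, slow down when it is nearly full) is also an assumption.

```python
from dataclasses import dataclass

# Hypothetical frequency levels in MHz; real values depend on the platform.
FREQS_MHZ = {"little": [600, 1000, 1400], "big": [800, 1400, 2000]}


@dataclass
class Frame:
    ftype: str  # frame type: "I", "P" or "B"
    size: int   # compressed frame size in bytes


def choose_cluster(frame: Frame, size_threshold: int = 20_000) -> str:
    """Stand-in for the design-time model (steps 1 and 2).

    Assumption: intra frames and large frames are costly to decode,
    so they are sent to the big cores; everything else goes to the
    little cores.
    """
    if frame.ftype == "I" or frame.size > size_threshold:
        return "big"
    return "little"


def next_freq_index(index: int, buffered: int, capacity: int,
                    n_levels: int, low: float = 0.25,
                    high: float = 0.75) -> int:
    """Buffer-driven frequency adjustment (step 3).

    Assumption: a nearly empty output buffer means the decoder risks
    missing display deadlines, so the frequency steps up by one level;
    a nearly full buffer means there is slack, so it steps down.
    """
    fill = buffered / capacity
    if fill < low:
        return min(index + 1, n_levels - 1)
    if fill > high:
        return max(index - 1, 0)
    return index


if __name__ == "__main__":
    frame = Frame(ftype="P", size=50_000)
    cluster = choose_cluster(frame)
    levels = FREQS_MHZ[cluster]
    idx = next_freq_index(1, buffered=1, capacity=8, n_levels=len(levels))
    print(cluster, levels[idx])
```

In this sketch the per-frame decision costs one comparison and one table lookup, which is in the spirit of the low overhead reported in the abstract; a model trained offline could replace the rule in `choose_cluster` without changing the rest of the loop.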

Subject Classification

ACM Subject Classification
  • Hardware → Platform power issues
  • Hardware → Chip-level power issues
  • Computing methodologies → Classification and regression trees
  • Computer systems organization → Multicore architectures
Keywords
  • energy consumption
  • mobile platform
  • heterogeneous architecture
  • software video decoding
  • hardware video decoding
  • HEVC
