Using Lock Servers to Scale Real-Time Locking Protocols: Chasing Ever-Increasing Core Counts

Authors: Catherine E. Nemitz, Tanya Amert, James H. Anderson



File

LIPIcs.ECRTS.2018.25.pdf
  • Filesize: 1.2 MB
  • 24 pages

Author Details

Catherine E. Nemitz
  • The University of North Carolina at Chapel Hill, USA
Tanya Amert
  • The University of North Carolina at Chapel Hill, USA
James H. Anderson
  • The University of North Carolina at Chapel Hill, USA

Cite As

Catherine E. Nemitz, Tanya Amert, and James H. Anderson. Using Lock Servers to Scale Real-Time Locking Protocols: Chasing Ever-Increasing Core Counts. In 30th Euromicro Conference on Real-Time Systems (ECRTS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 106, pp. 25:1-25:24, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.ECRTS.2018.25

Abstract

During the past decade, parallelism-related issues have been at the forefront of real-time systems research due to the advent of multicore technologies. In the coming years, such issues will loom ever larger due to increasing core counts. Having more cores means a greater potential exists for platform capacity loss when the available parallelism cannot be fully exploited. In this paper, such capacity loss is considered in the context of real-time locking protocols. In this context, lock nesting becomes a key concern as it can result in transitive blocking chains that force tasks to execute sequentially unnecessarily. Such chains can be quite long on a larger machine. Contention-sensitive real-time locking protocols have been proposed as a means of "breaking" transitive blocking chains, but such protocols tend to have high overhead due to more complicated lock/unlock logic. To ease such overhead, the usage of lock servers is considered herein. In particular, four specific lock-server paradigms are proposed and many nuances concerning their deployment are explored. Experiments are presented that show that, by executing cache hot, lock servers can enable reductions in lock/unlock overhead of up to 86%. Such reductions make contention-sensitive protocols a viable approach in practice.

Subject Classification

ACM Subject Classification
  • Computer systems organization → Real-time systems
  • Computer systems organization → Embedded and cyber-physical systems
  • Software and its engineering → Mutual exclusion
  • Software and its engineering → Real-time systems software
  • Software and its engineering → Synchronization
  • Software and its engineering → Process synchronization

Keywords
  • multiprocess locking protocols
  • nested locks
  • priority-inversion blocking
  • reader/writer locks
  • real-time locking protocols
