Turning Futexes Inside-Out: Efficient and Deterministic User Space Synchronization Primitives for Real-Time Systems with IPCP

Author Alexander Zuepke

Thumbnail PDF


  • Filesize: 490 kB
  • 23 pages

Document Identifiers

Author Details

Alexander Zuepke
  • RheinMain University of Applied Sciences, Wiesbaden, Germany

Cite AsGet BibTex

Alexander Zuepke. Turning Futexes Inside-Out: Efficient and Deterministic User Space Synchronization Primitives for Real-Time Systems with IPCP. In 32nd Euromicro Conference on Real-Time Systems (ECRTS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 165, pp. 11:1-11:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


In Linux and other operating systems, futexes (fast user space mutexes) are the underlying synchronization primitives to implement POSIX synchronization mechanisms, such as blocking mutexes, condition variables, and semaphores. Futexes allow one to implement mutexes with excellent performance by avoiding system calls in the fast path. However, futexes are fundamentally limited to synchronization mechanisms that are expressible as atomic operations on 32-bit variables. At operating system kernel level, futex implementations require complex mechanisms to look up internal wait queues making them susceptible to determinism issues. In this paper, we present an alternative design for futexes by completely moving the complexity of wait queue management from the operating system kernel into user space, i. e. we turn futexes "inside out". The enabling mechanisms for "inside-out futexes" are an efficient implementation of the immediate priority ceiling protocol (IPCP) to achieve non-preemptive critical sections in user space, spinlocks for mutual exclusion, and interwoven services to suspend or wake up threads. The design allows us to implement common thread synchronization mechanisms in user space and to move determinism concerns out of the kernel while keeping the performance properties of futexes. The presented approach is suitable for multi-processor real-time systems with partitioned fixed-priority (P-FP) scheduling on each processor. We evaluate the approach with an implementation for mutexes and condition variables in a real-time operating system (RTOS). Experimental results on 32-bit ARM platforms show that the approach is feasible, and overheads are driven by low-level synchronization primitives.

Subject Classification

ACM Subject Classification
  • Computer systems organization → Real-time operating systems
  • Software and its engineering → Mutual exclusion
  • Futex
  • Immediate Priority Ceiling Protocol
  • Critical Section
  • Monitor


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. Samy Al-Bahra. Nonblocking algorithms and scalable multicore programming. Commun. ACM, 56(7):50-61, 2013. URL: https://doi.org/10.1145/2483852.2483866.
  2. Hesham Almatary, Neil C. Audsley, and Alan Burns. Reducing the implementation overheads of IPCP and DFP. In 2015 IEEE Real-Time Systems Symposium, RTSS 2015, San Antonio, Texas, USA, December 1-4, 2015, pages 295-304. IEEE Computer Society, 2015. URL: https://doi.org/10.1109/RTSS.2015.35.
  3. David F. Bacon, Ravi Konuru, Chet Murthy, and Mauricio Serrano. Thin locks: Featherweight synchronization for java. In Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation, PLDI '98, pages 258-268, New York, NY, USA, 1998. ACM. URL: https://doi.org/10.1145/277650.277734.
  4. Andrew Birrell, John V. Guttag, James J. Horning, and Roy Levin. Synchronization primitives for a multiprocessor: A formal specification. In Les Belady, editor, Proceedings of the Eleventh ACM Symposium on Operating System Principles, SOSP 1987, Stouffer Austin Hotel, Austin, Texas, USA, November 8-11, 1987, pages 94-102. ACM, 1987. URL: https://doi.org/10.1145/41457.37509.
  5. David L. Black. Scheduling Support for Concurrency and Parallelism in the Mach Operating System. IEEE Computer, 23(5):35-43, 1990. URL: https://doi.org/10.1109/2.53353.
  6. Bernard Blackham, Yao Shi, Sudipta Chattopadhyay, Abhik Roychoudhury, and Gernot Heiser. Timing analysis of a protected operating system kernel. In Proceedings of the 32nd IEEE Real-Time Systems Symposium, RTSS 2011, Vienna, Austria, November 29 - December 2, 2011, pages 339-348. IEEE Computer Society, 2011. URL: https://doi.org/10.1109/RTSS.2011.38.
  7. Björn B. Brandenburg. Scheduling and Locking in Multiprocessor Real-Time Operating Systems. PhD thesis, The University of North Carolina at Chapel Hill, 2011. Google Scholar
  8. Neil Brown. In pursuit of faster futexes. LWN, May 2016. URL: https://lwn.net/Articles/685769/.
  9. Davidlohr Bueso and Scott Norton. An Overview of Kernel Lock Improvements. LinuxCon North America, Chicago, IL, August 2014. URL: http://events17.linuxfoundation.org/sites/events/files/slides/linuxcon-2014-locking-final.pdf.
  10. Peter A. Buhr, Michel Fortier, and Michael H. Coffin. Monitor classification. ACM Comput. Surv., 27(1):63-107, March 1995. URL: https://doi.org/10.1145/214037.214100.
  11. Alan Burns, Marina Gutiérrez, Mario Aldea Rivas, and Michael González Harbour. A deadline-floor inheritance protocol for EDF scheduled embedded real-time systems with resource sharing. IEEE Trans. Computers, 64(5):1241-1253, 2015. URL: https://doi.org/10.1109/TC.2014.2322619.
  12. Alan Burns and Andy J. Wellings. Real-Time Systems and Programming Languages: Ada, Real-Time Java and C/Real-Time POSIX. Addison-Wesley Educational Publishers Inc, USA, 4th edition, 2009. Google Scholar
  13. Jan Edler, Jim Lipkis, and Edith Schonberg. Process management for highly parallel UNIX systems. In Proceedings of the USENIX Workshop on Unix and Supercomputers, pages 1-17, September 1988. Google Scholar
  14. Hubertus Franke, Rusty Russell, and Matthew Kirkwood. Fuss, Futexes and Furwocks: Fast Userlevel Locking in Linux. In Proceedings of the 2002 Ottawa Linux Symposium, pages 479-495, 2002. Google Scholar
  15. Paolo Gai, Giuseppe Lipari, and Marco Di Natale. Minimizing memory utilization of real-time task sets in single and multi-processor systems-on-a-chip. In Proceedings of the 22nd IEEE Real-Time Systems Symposium (RTSS 2001), London, UK, 2-6 December 2001, pages 73-83. IEEE Computer Society, 2001. URL: https://doi.org/10.1109/REAL.2001.990598.
  16. Darren Hart and Dinakar Guniguntalay. Requeue-PI: Making Glibc Condvars PI-Aware. In Eleventh Real-Time Linux Workshop, pages 215-227, 2009. Google Scholar
  17. Philip Holman and James H. Anderson. Locking in Pfair-scheduled multiprocessor systems. In Proceedings of the 23rd IEEE Real-Time Systems Symposium (RTSS'02), Austin, Texas, USA, December 3-5, 2002, pages 149-158. IEEE Computer Society, 2002. URL: https://doi.org/10.1109/REAL.2002.1181570.
  18. James Leslie Keedy. An outline of the ICL 2900 series system architecture. Australian Computer Journal, 9(2):53-62, 1977. Google Scholar
  19. Paul Kocher, Jann Horn, Anders Fogh, Daniel Genkin, Daniel Gruss, Werner Haas, Mike Hamburg, Moritz Lipp, Stefan Mangard, Thomas Prescher, Michael Schwarz, and Yuval Yarom. Spectre attacks: Exploiting speculative execution. In 2019 IEEE Symposium on Security and Privacy, SP 2019, San Francisco, CA, USA, May 19-23, 2019, pages 1-19. IEEE, 2019. URL: https://doi.org/10.1109/SP.2019.00002.
  20. Leonidas I. Kontothanassis, Robert W. Wisniewski, and Michael L. Scott. Scheduler-conscious synchronization. ACM Trans. Comput. Syst., 15(1):3-40, 1997. URL: https://doi.org/10.1145/244764.244765.
  21. Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner Haas, Anders Fogh, Jann Horn, Stefan Mangard, Paul Kocher, Daniel Genkin, Yuval Yarom, and Mike Hamburg. Meltdown: Reading kernel memory from user space. In William Enck and Adrienne Porter Felt, editors, 27th USENIX Security Symposium, USENIX Security 2018, Baltimore, MD, USA, August 15-17, 2018, pages 973-990. USENIX Association, 2018. URL: https://www.usenix.org/conference/usenixsecurity18/presentation/lipp.
  22. Brian D. Marsh, Michael L. Scott, Thomas J. LeBlanc, and Evangelos P. Markatos. First-class user-level threads. In Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles, SOSP ’91, page 110–121, New York, NY, USA, 1991. Association for Computing Machinery. URL: https://doi.org/10.1145/121132.344329.
  23. Ross McIlroy, Jaroslav Sevcík, Tobias Tebbi, Ben L. Titzer, and Toon Verwaest. Spectre is here to stay: An analysis of side-channels and speculative execution. CoRR, abs/1902.05178, 2019. URL: http://arxiv.org/abs/1902.05178.
  24. John M. Mellor-Crummey and Michael L. Scott. Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans. Comput. Syst., 9(1):21-65, 1991. URL: https://doi.org/10.1145/103727.103729.
  25. Maged M. Michael and Michael L. Scott. Nonblocking algorithms and preemption-safe locking on multiprogrammed shared memory multiprocessors. J. Parallel Distrib. Comput., 51(1):1-26, 1998. URL: https://doi.org/10.1006/jpdc.1998.1446.
  26. John K. Ousterhout. Scheduling techniques for concurrent systems. In Proceedings of the 3rd International Conference on Distributed Computing Systems, Miami/Ft. Lauderdale, Florida, USA, October 18-22, 1982, pages 22-30. IEEE Computer Society, 1982. Google Scholar
  27. John K. Ousterhout. Why aren't operating systems getting faster as fast as hardware? In Proceedings of the Usenix Summer 1990 Technical Conference, Anaheim, California, USA, June 1990, pages 247-256. USENIX Association, 1990. Google Scholar
  28. Filip Pizlo. Locking in webkit. online article, May 2016. URL: https://webkit.org/blog/6161/locking-in-webkit/.
  29. R. Rajkumar. Real-time synchronization protocols for shared memory multiprocessors. In 10th International Conference on Distributed Computing Systems (ICDCS 1990), May 28 - June 1, 1990, Paris, France, pages 116-123. IEEE Computer Society, 1990. URL: https://doi.org/10.1109/ICDCS.1990.89257.
  30. David P. Reed and Rajendra K. Kanodia. Synchronization with eventcounts and sequencers. Commun. ACM, 22(2):115–123, February 1979. URL: https://doi.org/10.1145/359060.359076.
  31. Benoit Schillings. Be Engineering Insights: Benaphores. Be Newsletters, 1(26), May 1996. Google Scholar
  32. Lui Sha, Ragunathan Rajkumar, and John P. Lehoczky. Priority inheritance protocols: An approach to real-time synchronization. IEEE Trans. Computers, 39(9):1175-1185, 1990. URL: https://doi.org/10.1109/12.57058.
  33. Livio Soares and Michael Stumm. Flexsc: Flexible system call scheduling with exception-less system calls. In Remzi H. Arpaci-Dusseau and Brad Chen, editors, 9th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2010, October 4-6, 2010, Vancouver, BC, Canada, Proceedings, pages 33-46. USENIX Association, 2010. URL: http://www.usenix.org/events/osdi10/tech/full_papers/Soares.pdf.
  34. Roy Spliet, Manohar Vanga, Björn B. Brandenburg, and Sven Dziadek. Fast on average, predictable in the worst case: Exploring real-time futexes in LITMUSRT. In Proceedings of the IEEE 35th IEEE Real-Time Systems Symposium, RTSS 2014, Rome, Italy, December 2-5, 2014, pages 96-105. IEEE Computer Society, 2014. URL: https://doi.org/10.1109/RTSS.2014.33.
  35. Alexander Wieder and Björn B. Brandenburg. On spin locks in AUTOSAR: blocking analysis of fifo, unordered, and priority-ordered spin locks. In Proceedings of the IEEE 34th Real-Time Systems Symposium, RTSS 2013, Vancouver, BC, Canada, December 3-6, 2013, pages 45-56. IEEE Computer Society, 2013. URL: https://doi.org/10.1109/RTSS.2013.13.
  36. Alexander Zuepke. Deterministic fast user space synchronisation. In OSPERT Workshop, July 2013. Google Scholar
  37. Alexander Zuepke, Marc Bommert, and Daniel Lohmann. AUTOBEST: a united AUTOSAR-OS and ARINC 653 kernel. In 21st IEEE Real-Time and Embedded Technology and Applications Symposium, Seattle, WA, USA, April 13-16, 2015, pages 133-144. IEEE Computer Society, 2015. URL: https://doi.org/10.1109/RTAS.2015.7108435.
  38. Alexander Zuepke and Robert Kaiser. Deterministic futexes: Addressing WCET and bounded interference concerns. In Björn B. Brandenburg, editor, 25th IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS 2019, Montreal, QC, Canada, April 16-18, 2019, pages 65-76. IEEE, 2019. URL: https://doi.org/10.1109/RTAS.2019.00014.