Cache-Conscious Offline Real-Time Task Scheduling for Multi-Core Processors

Authors Viet Anh Nguyen, Damien Hardy, Isabelle Puaut



PDF
Thumbnail PDF

File

LIPIcs.ECRTS.2017.14.pdf
  • Filesize: 0.87 MB
  • 22 pages

Document Identifiers

Author Details

Viet Anh Nguyen
Damien Hardy
Isabelle Puaut

Cite As Get BibTex

Viet Anh Nguyen, Damien Hardy, and Isabelle Puaut. Cache-Conscious Offline Real-Time Task Scheduling for Multi-Core Processors. In 29th Euromicro Conference on Real-Time Systems (ECRTS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 76, pp. 14:1-14:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017) https://doi.org/10.4230/LIPIcs.ECRTS.2017.14

Abstract

Most schedulability analysis techniques for multi-core architectures assume a single Worst-Case Execution Time (WCET) per task, which is valid in all execution conditions. This assumption is too pessimistic for parallel applications running on multi-core architectures with local instruction or data caches, for which the WCET of a task depends on the cache contents at the beginning of its execution, itself depending on the task that was executed before the task under study.
  
In this paper, we propose two scheduling techniques for multi-core architectures equipped with local instruction and data caches. The two techniques schedule a parallel application modeled as a task graph, and generate a static partitioned non-preemptive schedule. We propose an optimal method, using an Integer Linear Programming (ILP) formulation, as well as a heuristic method based on list scheduling. Experimental results show that by taking into account the effect of private caches on tasks' WCETs, the length of generated schedules is significantly reduced as compared to schedules generated by cache-unaware scheduling methods. The observed schedule length reduction on streaming applications is 11% on average for the optimal method and 9% on average for the heuristic method.

Subject Classification

Keywords
  • Real-time scheduling
  • Cache-conscious scheduling
  • Many-core architectures
  • ILP
  • Static list scheduling

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. S. Altmeyer, R. I. Davis, L. Indrusiak, C. Maiza, V. Nelis, and J. Reineke. A generic and compositional framework for multicore response time analysis. In International Conference on Real Time and Networks Systems, RTNS '15, pages 129-138, 2015. Google Scholar
  2. J. H. Bahn, J. Yang, and N. Bagherzadeh. Parallel FFT algorithms on network-on-chips. In Fifth International Conference on Information Technology: New Generations (ITNG 2008), pages 1087-1093, 2008. Google Scholar
  3. M. Becker, D. Dasari, B. Nikolic, B. Akesson, V. Nélis, and T. Nolte. Contention-free execution of automotive applications on a clustered many-core platform. In 28th Euromicro Conference on Real-Time Systems, ECRTS, pages 14-24, 2016. Google Scholar
  4. J. M. Calandrino and J. H. Anderson. On the design and implementation of a cache-aware multicore real-time scheduler. In 21st Euromicro Conference on Real-Time Systems, pages 194-204, 2009. Google Scholar
  5. T. Carle, M. Djemal, D. Potop-Butucaru, R. de Simone, and Z. Zhang. Static mapping of real-time applications onto massively parallel processor arrays. In Proceedings of the 2014 14th International Conference on Application of Concurrency to System Design, ACSD '14, pages 112-121, 2014. Google Scholar
  6. S. Chattopadhyay, A. Roychoudhury, and T. Mitra. Modeling shared cache and bus in multi-cores for timing analysis. In Proceedings of the 13th International Workshop on Software &Compilers for Embedded Systems, SCOPES '10, pages 6:1-6:10, 2010. Google Scholar
  7. R. I. Davis and A. Burns. A survey of hard real-time scheduling for multiprocessor systems. ACM Computing Surveys, 43(4):35:1-35:44, 2011. Google Scholar
  8. B. D. de Dinechin, D. van Amstel, M. Poulhiès, and G. Lager. Time-critical computing on a single-chip massively parallel processor. In Proceedings of the Conference on Design, Automation &Test in Europe, DATE '14, pages 97:1-97:6, 2014. Google Scholar
  9. H. Ding, Y. Liang, and T. Mitra. Shared cache aware task mapping for WCRT minimization. In 8th Asia and South Pacific Design Automation Conference, ASP-DAC, pages 735-740, 2013. Google Scholar
  10. G. Fernandez, J. Abella, E. Quiñones, C. Rochange, T. Vardanega, and F. J. Cazorla. Contention in multicore hardware shared resources: Understanding of the state of the art. In 14th International Workshop on Worst-Case Execution Time Analysis, OpenAccess Series in Informatics (OASIcs), pages 31-42, 2014. Google Scholar
  11. N. Guan, M. Stigge, W. Yi, and G. Yu. Cache-aware scheduling and analysis for multicores. In Proceedings of the Seventh ACM International Conference on Embedded Software, EMSOFT '09, pages 245-254, 2009. Google Scholar
  12. Inc. Gurobi Optimization. Gurobi optimizer reference manual, 2015. Google Scholar
  13. D. Hardy, T. Piquet, and I. Puaut. Using bypass to tighten WCET estimates for multi-core processors with shared instruction caches. In Proceedings of the 30th IEEE Real-Time Systems Symposium, RTSS, pages 68-77, 2009. Google Scholar
  14. H. Kasahara and S. Narita. Practical multiprocessor scheduling algorithms for efficient parallel processing. IEEE Trans. Comput., 33(11):1023-1029, 1984. Google Scholar
  15. T. Kelter, H. Falk, P. Marwedel, S. Chattopadhyay, and A. Roychoudhury. Static analysis of multi-core tdma resource arbitration delays. Real-Time Syst., 50(2):185-229, 2014. Google Scholar
  16. Y.-K. Kwok and I. Ahmad. Benchmarking and comparison of the task graph scheduling algorithms. In Journal of Parallel and Distributed Computing, volume 59, pages 381-422, 1999. Google Scholar
  17. Y.-K. Kwok and I. Ahmad. Static scheduling algorithms for allocating directed task graphs to multiprocessors. ACM Computing Surveys, 31(4):406-471, 1999. Google Scholar
  18. Y. Liang, H. Ding, T. Mitra, A. Roychoudhury, Y. Li, and V. Suhendra. Timing analysis of concurrent programs running on shared cache multi-cores. Real-time Systems, 48(6):638-680, 2012. Google Scholar
  19. V. Nélis, P. M. Yomsi, and L. M. Pinho. The variability of application execution times on a multi-core platform. In 16th International Workshop on Worst-Case Execution Time Analysis (WCET 2016), OpenAccess Series in Informatics (OASIcs), pages 1-11, 2016. Google Scholar
  20. V. Nélis, P. M. Yomsi, L. M. Pinho, J. C. Fonseca, M. Bertogna, E. Quiñones, R. Vargas, and A. Marongiu. The challenge of time-predictability in modern many-core architectures. In 14th International Workshop on Worst-Case Execution Time Analysis, volume 39 of OpenAccess Series in Informatics (OASIcs), pages 63-72, 2014. Google Scholar
  21. F. Nemer, H. Cassé, P. Sainrat, and A. Awada. Improving the worst-case execution time accuracy by inter-task instruction cache analysis. In IEEE Second International Symposium on Industrial Embedded Systems, SIES, pages 25-32, 2007. Google Scholar
  22. R. Pellizzoni, E. Betti, S. Bak, G. Yao, J. Criswell, M. Caccamo, and R. Kegley. A predictable execution model for cots-based embedded systems. In Proceedings of the 2011 17th IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS '11, pages 269-279, 2011. Google Scholar
  23. Q. Perret, P. Maurère, E. Noulard, C. Pagetti, P. Sainrat, and B. Triquet. Mapping hard real-time applications on many-core processors. In Proceedings of the 24th International Conference on Real-Time Networks and Systems, RTNS '16, pages 235-244. ACM, 2016. Google Scholar
  24. Q. Perret, P. Maurère, E. Noulard, C. Pagetti, P. Sainrat, and B. Triquet. Temporal isolation of hard real-time applications on many-core processors. In 2016 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), pages 37-47, 2016. Google Scholar
  25. G. Phavorin, P. Richard, J. Goossens, T. Chapeaux, and C. Maiza. Scheduling with preemption delays: Anomalies and issues. In Proceedings of the 23rd International Conference on Real Time and Networks Systems, RTNS '15, pages 109-118, 2015. Google Scholar
  26. D. Potop-Butucaru and I. Puaut. Integrated worst-case execution time estimation of multicore applications. In 13th International Workshop on Worst-Case Execution Time Analysis, volume 30, pages 21-31, 2013. Google Scholar
  27. W. Puffitsch, E. Noulard, and C. Pagetti. Off-line mapping of multi-rate dependent task sets to many-core platforms. Real-Time Systems, 51(5):526-565, 2015. Google Scholar
  28. H. Rihani, M. Moy, C. Maiza, R. I. Davis, and S. Altmeyer. Response time analysis of synchronous data flow programs on a many-core processor. In Proceedings of the 24th International Conference on Real-Time Networks and Systems, RTNS '16, pages 67-76, 2016. Google Scholar
  29. V. Suhendra, C. Raghavan, and T. Mitra. Integrated scratchpad memory optimization and task scheduling for mpsoc architectures. In International Conference on Compilers, Architecture and Synthesis for Embedded Systems, CASES '06, pages 401-410, 2006. Google Scholar
  30. P. Tendulkar, P. Poplavko, I. Galanommatis, and O. Maler. Many-core scheduling of data parallel applications using SMT solvers. In 17th Euromicro Conference on Digital System Design, DSD, pages 615-622, 2014. Google Scholar
  31. C. Tessler and N. Fisher. BUNDLE: real-time multi-threaded scheduling to reduce cache contention. In IEEE Real-Time Systems Symposium, RTSS, pages 279-290, 2016. Google Scholar
  32. W. Thies and S. Amarasinghe. An empirical characterization of stream programs and its implications for language and compiler design. In Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, PACT '10, pages 365-376, 2010. Google Scholar
  33. B. C. Ward, A. Thekkilakattil, and J. H. Anderson. Optimizing preemption-overhead accounting in multiprocessor real-time systems. In Proceedings of the 22Nd International Conference on Real-Time Networks and Systems, RTNS '14, pages 235:235-235:243, 2014. Google Scholar
  34. R. Wilhelm, J. Engblom, A. Ermedahl, N. Holsti, S. Thesing, D. Whalley, G. Bernat, C. Ferdinand, R. Heckmann, T. Mitra, F. Mueller, I. Puaut, P. Puschner, J. Staschulat, and P. Stenström. The worst-case execution-time problem: Overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst., 7(3):36:1-36:53, 2008. Google Scholar
  35. G. Yao, R. Pellizzoni, S. Bak, E. Betti, and M. Caccamo. Memory-centric scheduling for multicore hard real-time systems. Real-Time Systems, 48(6):681-715, 2012. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail