Scheduling Splittable Jobs on Configurable Machines

Authors: Matthew Casey, Rajmohan Rajaraman, David Stalfa, Cheng Tan




File

LIPIcs.APPROX-RANDOM.2024.22.pdf
  • Filesize: 0.97 MB
  • 20 pages

Author Details

Matthew Casey
  • Northeastern University, Boston MA 02115, USA
Rajmohan Rajaraman
  • Northeastern University, Boston MA 02115, USA
David Stalfa
  • Northeastern University, Boston MA 02115, USA
Cheng Tan
  • Northeastern University, Boston MA 02115, USA

Cite As

Matthew Casey, Rajmohan Rajaraman, David Stalfa, and Cheng Tan. Scheduling Splittable Jobs on Configurable Machines. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 317, pp. 22:1-22:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2024.22

Abstract

Motivated by modern architectures allowing for the partitioning of a GPU into hardware-separated instances, we initiate the study of scheduling splittable jobs on configurable machines. We consider machines that can be configured into smaller instances, which we call blocks, in multiple ways, each of which is referred to as a configuration. We introduce the Configurable Machine Scheduling (CMS) problem, where we are given n jobs and a set C of configurations. A schedule consists of a set of machines, each assigned some configuration in C, with each block in the configuration assigned to process one job. The amount of a job’s demand that is satisfied by a block is given by an arbitrary function of the job and block. The objective is to construct a schedule using as few machines as possible. We provide a tight logarithmic-factor approximation algorithm for this problem in the general setting, a factor (3 + ε) approximation algorithm for arbitrary ε > 0 when there are O(1) input configurations, and a polynomial-time approximation scheme when both the number and size of configurations are O(1). Finally, we utilize a technique for finding conic integer combinations in fixed dimension to develop an optimal polynomial-time algorithm in the case with O(1) jobs, O(1) blocks, and every configuration up to a given size.
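
To make the problem statement concrete, the following is a minimal sketch of a feasibility check for a CMS schedule. It is not one of the paper's algorithms. The names `is_feasible`, `CONFIGS`, `DEMAND`, and `RATE`, the block types "big" and "small", and all numeric values are hypothetical. The sketch also assumes that a job counts as served once the total contribution of its assigned blocks meets or exceeds its demand.

```python
from typing import Callable, List, Sequence, Set, Tuple

Config = Tuple[str, ...]                  # a configuration: one block type per block
Machine = Tuple[Config, Tuple[int, ...]]  # a configuration plus the job run on each block


def is_feasible(
    machines: Sequence[Machine],
    configs: Set[Config],
    demand: List[int],
    rate: Callable[[int, str], int],
) -> bool:
    """Return True if every machine uses an allowed configuration with one job
    per block and every job's demand is covered (assumed: met or exceeded)."""
    satisfied = [0] * len(demand)
    for config, jobs in machines:
        # Each machine must use a configuration from C, with exactly one job per block.
        if config not in configs or len(jobs) != len(config):
            return False
        for block, job in zip(config, jobs):
            # The contribution is an arbitrary function of the job and the block.
            satisfied[job] += rate(job, block)
    return all(s >= d for s, d in zip(satisfied, demand))


# Hypothetical instance: two block types, two configurations, two jobs.
CONFIGS = {("big",), ("small", "small")}
DEMAND = [4, 2]
RATE = {(0, "big"): 3, (0, "small"): 1, (1, "big"): 2, (1, "small"): 2}


def rate(job: int, block: str) -> int:
    return RATE[(job, block)]


two_machines = [(("big",), (0,)), (("small", "small"), (0, 1))]
one_machine = [(("small", "small"), (0, 1))]
print(is_feasible(two_machines, CONFIGS, DEMAND, rate))
print(is_feasible(one_machine, CONFIGS, DEMAND, rate))
```

In this toy instance the two-machine schedule covers both demands, while the single machine leaves job 0 short. The CMS objective is to find a feasible schedule with the fewest machines; this check only validates a candidate schedule.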

Subject Classification

ACM Subject Classification
  • Theory of computation → Scheduling algorithms
Keywords
  • Scheduling algorithms
  • Approximation algorithms
  • Configurable machines
  • Splittable jobs
  • Linear programming
