Machine learning (ML) enables forecasts, even in real time, at ever lower cost and with ever better accuracy. Today, data scientists can collect more data, access that data faster, and apply more complex data analyses than ever before. As a result, ML impacts a variety of fields such as healthcare, finance, and entertainment. These advances are mainly due to the exponential evolution of hardware, the availability of large datasets, and the emergence of machine learning frameworks that hide the complexities of the underlying hardware and boost the productivity of data scientists. On the other hand, the computational needs of powerful ML models have grown by several orders of magnitude over the past decade. A state-of-the-art large language model can cost millions of dollars to train in the cloud [The AI Index Report, 2024], without accounting for the electricity cost and carbon footprint [Dodge et al., 2022][Wu et al., 2024]. This makes the current rate of increase in model parameters, dataset sizes, and compute budgets unsustainable. To achieve more sustainable progress in ML, it is essential to invest in more resource-, energy-, and cost-efficient solutions.

In this Dagstuhl Seminar, our main goal was to reason critically about how we build software and hardware for end-to-end machine learning. The participants were experts from academia and industry spanning data management, machine learning, compilers, systems, and computer architecture, with expertise in algorithmic optimizations for machine learning, job scheduling and resource management in distributed computing, parallel computing, and data management and processing.
During the seminar, we explored how to improve the resource efficiency of ML through a holistic view of the ML landscape, including data preparation and loading, continual retraining of models in dynamic data environments, compiling ML for specialized hardware accelerators, hardware/software co-design for ML, and serving models for real-time applications with low-latency requirements in resource-constrained environments. We hope that the discussions and the work planned during the seminar will raise awareness of how modern hardware is utilized and kickstart future developments that minimize hardware underutilization while still enabling emerging applications powered by ML.