Power-aware Deep Learning Model Serving with µ-Serve
Haoran Qiu, Weichao Mao, et al.
USENIX ATC 2024
With demand for computational resources in data centers surging, and their environmental impact growing with it, cloud platforms increasingly need technologies that reduce the carbon emissions associated with high computational energy consumption. Because renewable energy availability varies over time and the mix of power sources differs across grid regions, the carbon intensity of electricity generation varies by time and location. Accounting for these variations when deciding where and when to run workloads could therefore significantly reduce carbon emissions. This talk introduces Caspian, a carbon-aware workload dispatcher for multi-cluster Kubernetes environments, which uses the Kube-Stellar platform to distribute workloads among clusters. Caspian includes an optimizer that decides the placement and scheduling of workloads across geographically distributed data centers based on the carbon intensity and resource availability of each cluster. Our experimental analysis shows that Caspian can effectively reduce the carbon emissions associated with computational energy consumption compared to the default scheduler in the Kube-Stellar platform.
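To make the idea of carbon-aware placement concrete, the following is a minimal sketch of a greedy heuristic, not Caspian's actual optimizer, which the abstract does not describe in detail. The names `Cluster`, `Workload`, and `place`, as well as the CPU-only capacity model and the per-workload energy estimate, are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    carbon_intensity: float  # gCO2/kWh for the current time slot (hypothetical input)
    free_cpus: int           # available capacity, simplified to CPUs only


@dataclass
class Workload:
    name: str
    cpus: int
    energy_kwh: float        # assumed energy estimate for one run


def place(workloads, clusters):
    """Greedy heuristic: send each workload to the lowest-carbon cluster with room."""
    free = {c.name: c.free_cpus for c in clusters}
    by_carbon = sorted(clusters, key=lambda c: c.carbon_intensity)
    placement, emissions = {}, 0.0
    # Handle the most energy-hungry workloads first so they get the cleanest clusters.
    for w in sorted(workloads, key=lambda w: w.energy_kwh, reverse=True):
        for c in by_carbon:
            if free[c.name] >= w.cpus:
                free[c.name] -= w.cpus
                placement[w.name] = c.name
                emissions += w.energy_kwh * c.carbon_intensity
                break
        else:
            placement[w.name] = None  # no cluster has capacity; defer this workload
    return placement, emissions
```

A real dispatcher would also have to decide when to run each workload, using carbon-intensity forecasts over time, and would likely solve a joint optimization rather than place workloads one by one. The sketch only illustrates how carbon intensity and resource availability can both enter a placement decision.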