Resource As You Wish: Collaborative Reservation and Allocation by Scheduler Plugin and Device Plugin

Takuya Mishina; Tatsuhiro Chiba

KubeDay Japan 2024

Talk

27 Aug 2024


Abstract

Kubernetes encapsulates the details of infrastructure so that users can describe their desired state as reusable manifests. However, this abstraction prevents us from maximizing the utilization of peripheral resources, because hardware topology heavily affects the performance of computation workloads. For example, inter-device and inter-node communication acceleration technologies such as Direct Memory Access (DMA) become unavailable if multiple GPUs under different PCI Express bridges are allocated to a single set of AI workloads.

This presentation shows a concrete use case of the Kubernetes Scheduling Framework. Scheduler plugins, which leverage the framework, work jointly with a device plugin to let users allocate their preferred AI hardware devices to computing workloads. In addition to the approach built on existing stable technologies, the presentation also covers a mechanism based on Dynamic Resource Allocation (DRA), a promising technology for device management in Kubernetes.
