Optimizing energy, locality and priority in a MapReduce cluster
Abstract
Striking a balance between energy and performance in data centers is difficult because workloads differ significantly and carry varying performance constraints. The issue is exacerbated by the introduction of MapReduce alongside conventional web applications. In particular, for batch and interactive MapReduce (e.g., the Spark system), data availability and locality drive performance, while the two exhibit different degrees of delay sensitivity. In this paper we consider an energy minimization framework, formulated as a concave minimization problem, that explicitly models (i) time variability, (ii) data locality, and (iii) the delay sensitivity of web applications, batch MapReduce, and interactive MapReduce. Our objective is to maximize the usage of MapReduce servers by delaying batch MapReduce and offering execution to web workloads whenever capacity permits. We propose a two-step approach: first, a controller dynamically allocates servers to the three types of workloads; second, a MapReduce scheduler achieves the optimal data locality. To cater to the stochastic nature of the workloads, we use a Markov Decision Process model to design the allocation algorithm at the controller and derive the structure of the optimal policy. The proposed locality-aware scheduler is specifically engineered to sustain throughput during the transient overload caused by insufficient server allocation for batch MapReduce. We conclude by presenting simulation results from an extensive set of experiments, which indicate that the proposed methodology keeps data center costs to a minimum while ensuring that the delay constraints of the workloads are met.
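To make the controller's trade-off concrete, the following is a minimal sketch of a Markov Decision Process for deferring batch work, solved by value iteration. It is an illustrative toy and not the model developed in this paper: the single batch queue, the Bernoulli arrivals, the linear energy and delay costs, the discount factor, and every name in the code (`solve_batch_allocation`, `arrival_prob`, and so on) are assumptions introduced here for illustration only.

```python
def solve_batch_allocation(max_backlog, max_servers, arrival_prob,
                           energy_cost, delay_cost,
                           discount=0.9, sweeps=500):
    """Toy MDP (hypothetical): choose how many servers to give batch jobs.

    State  : number of queued batch jobs, 0..max_backlog.
    Action : servers allocated this slot; each server finishes one job.
    Cost   : energy_cost per active server + delay_cost per job left waiting.
    Returns a list where policy[b] is the server count chosen at backlog b.
    """
    values = [0.0] * (max_backlog + 1)
    policy = [0] * (max_backlog + 1)

    for _ in range(sweeps):
        new_values = []
        for backlog in range(max_backlog + 1):
            best_cost, best_servers = float("inf"), 0
            # Allocating more servers than queued jobs is never useful.
            for servers in range(min(backlog, max_servers) + 1):
                remaining = backlog - servers
                # At most one new job arrives per slot; the queue is capped.
                grown = min(remaining + 1, max_backlog)
                expected_future = ((1 - arrival_prob) * values[remaining]
                                   + arrival_prob * values[grown])
                cost = (energy_cost * servers
                        + delay_cost * remaining
                        + discount * expected_future)
                if cost < best_cost:
                    best_cost, best_servers = cost, servers
            new_values.append(best_cost)
            policy[backlog] = best_servers
        values = new_values

    return policy


# When energy is cheap relative to delay, the backlog is drained at once.
eager = solve_batch_allocation(6, 3, 0.5, energy_cost=0.01, delay_cost=1.0)

# When energy is expensive relative to delay, batch work is deferred.
lazy = solve_batch_allocation(6, 3, 0.5, energy_cost=100.0, delay_cost=0.1)
```

In this toy setting the two extremes behave as one would expect: cheap energy leads the controller to serve as many queued jobs as the server cap allows, while expensive energy leads it to leave batch jobs waiting. Intermediate prices fall between these two regimes.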