Exploiting user patience for scaling resource capacity in cloud services

Renato L. F. Cunha; Marcos D. Assunção; Carlos Cardonha; Marco A.S. Netto

doi:10.1109/CLOUD.2014.67

CLOUD 2014

Conference paper

03 Dec 2014



Abstract

An important feature of cloud computing is its elasticity, that is, the ability to dynamically adjust resource capacity according to the current system load. Auto-scaling is challenging because it must balance two conflicting objectives: minimising the system capacity made available to users and maximising QoS, which typically translates to short response times. Current auto-scaling techniques are based solely on load forecasts and ignore how users perceive cloud services. As a consequence, providers tend to provision a volume of resources that is significantly larger than necessary to keep users satisfied. In this article, we propose a scheduling algorithm and an auto-scaling triggering technique that exploit user patience to identify the critical times when auto-scaling is needed and the appropriate amount of capacity by which the cloud platform should be extended or shrunk. The proposed technique helps service providers reduce resource allocation costs while maintaining the same QoS for users. Our experiments show that it is possible to reduce resource-hours by up to approximately 8% compared to auto-scaling based on system utilisation.
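To make the idea of a patience-based trigger concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes each queued request carries a hypothetical per-user `patience` value, and that capacity is adjusted by a fixed `step` when the share of requests that have waited longer than their patience crosses a threshold. The names `Request`, `impatient_fraction`, `scaling_step`, and the default thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Request:
    waited: float    # seconds the request has already spent in the queue
    patience: float  # seconds this user is assumed to tolerate (hypothetical input)


def impatient_fraction(queue: Sequence[Request]) -> float:
    """Share of queued requests whose wait already exceeds the user's patience."""
    if not queue:
        return 0.0
    return sum(r.waited > r.patience for r in queue) / len(queue)


def scaling_step(
    queue: Sequence[Request],
    capacity: int,
    scale_out_at: float = 0.2,  # assumed threshold, not taken from the paper
    scale_in_at: float = 0.0,   # assumed threshold, not taken from the paper
    step: int = 1,
    min_capacity: int = 1,
) -> int:
    """Return the capacity change: +step, -step, or 0."""
    frac = impatient_fraction(queue)
    if frac > scale_out_at:
        # Too many users have run out of patience: add capacity.
        return step
    if frac <= scale_in_at and capacity - step >= min_capacity:
        # Nobody is waiting beyond their patience: release capacity.
        return -step
    return 0


if __name__ == "__main__":
    queue = [Request(5, 10), Request(12, 10), Request(30, 20), Request(1, 60)]
    print(impatient_fraction(queue), scaling_step(queue, capacity=4))
```

A utilisation-based trigger would instead compare busy capacity against a fixed threshold. The contrast drawn in the abstract is that a patience-aware signal can tolerate high utilisation as long as users are not yet waiting longer than they are willing to.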
