Abstract
In this article we propose a novel, yet practical, scheme which attempts to optimally balance the load on the servers of a clustered Web farm. The goal in solving this performance problem is to achieve minimal average response time for customer requests, and thus ultimately achieve maximal customer throughput. The article decouples the overall problem into two related but distinct mathematical subproblems, one static and one dynamic. We believe this natural decoupling is one of the major contributions of our article. The static component algorithm determines good assignments of sites to potentially overlapping servers. These cluster assignments, which, due to overhead, cannot be changed too frequently, have a major effect on achievable response time. Additionally, these assignments must be palatable to the sites themselves. The dynamic component algorithm is designed to handle real-time load balancing by routing customer requests from the network dispatcher to the servers. This algorithm must react to fluctuating customer request load while respecting the assignments of sites to servers determined by the static component. The static and dynamic components both employ in various contexts the same so-called goal setting algorithm. This algorithm determines the theoretically optimal load on each server, given hypothetical cluster assignments and site activity. We demonstrate the effectiveness of the overall load-balancing scheme via a number of simulation experiments. © 2001, ACM. All rights reserved.