Chidanand Apté, Fred Damerau, et al.
ACM Transactions on Information Systems (TOIS)
In this paper we extend a previously published approach to error recovery in enterprise storage controllers with multi-core processors. Our approach first partitions the tasks in the controller software's runtime into clusters of dependent tasks (recovery scopes). These recovery scopes are then mapped onto a set of recovery groups, which form the basis for task scheduling during both recovery and normal operation. This recovery-aware scheduling (RAS) replaces the performance-based scheduling of the storage controller. Through simulation and benchmark experiments, we find that: 1) the performance of RAS appears to be critically dependent on the values of recovery-related parameters; and 2) our fine-grained recovery approach promises to enhance storage system availability while keeping the additional overhead, and the resulting degradation in performance, under control. © Copyright 2009 by International Business Machines Corporation.
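The two steps described in the abstract (partitioning tasks into recovery scopes, then mapping scopes to recovery groups) can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes that a recovery scope is a connected component of an undirected task-dependency graph, and it uses a simple round-robin assignment of scopes to a fixed number of groups. The task names, the `recovery_scopes` and `map_to_groups` helpers, and the group count are all hypothetical.

```python
from collections import defaultdict


def recovery_scopes(tasks, dependencies):
    """Cluster tasks into scopes of mutually dependent tasks.

    Assumption: a scope is a connected component of the dependency
    graph, found here with a small union-find structure.
    """
    parent = {t: t for t in tasks}

    def find(t):
        # Follow parent links to the root, halving the path as we go.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in dependencies:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    scopes = defaultdict(list)
    for t in tasks:
        scopes[find(t)].append(t)
    # Sort for a deterministic result.
    return sorted(sorted(members) for members in scopes.values())


def map_to_groups(scopes, num_groups):
    """Assign scopes to recovery groups round-robin (illustrative policy only)."""
    groups = [[] for _ in range(num_groups)]
    for i, scope in enumerate(scopes):
        groups[i % num_groups].append(scope)
    return groups


# Hypothetical tasks and dependencies, for illustration only.
tasks = ["io_a", "io_b", "cache", "log", "scrub"]
dependencies = [("io_a", "cache"), ("io_b", "cache")]

scopes = recovery_scopes(tasks, dependencies)
groups = map_to_groups(scopes, 2)
```

In this sketch a scheduler would treat each group as the unit for dispatching tasks; how the paper actually sizes groups and sets its recovery-related parameters is not stated in the abstract.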
Chi-Leung Wong, Zehra Sura, et al.
I-SPAN 2002
Apostol Natsev, Alexander Haubold, et al.
MMSP 2007
Kaoutar El Maghraoui, Gokul Kandiraju, et al.
WOSP/SIPEW 2010