Performance of recovery time improvement algorithms for software RAIDs
Abstract
A software RAID is a RAID implemented purely in software running on a host computer. One problem with software RAIDs is that they do not have access to special hardware such as NVRAM. As a result, a software RAID may need to check every parity group of an array for consistency following a host crash or power failure. This process of checking parity groups is called recovery, and it results in long delays when the software RAID is restarted. In this paper, we review two algorithms that reduce this recovery time for software RAIDs: the PGS Bitmap Algorithm we proposed in [5] and the List Algorithm proposed in [1]. We compare the performance of these two algorithms using trace-driven simulations. Our results show that the PGS Bitmap Algorithm can reduce recovery time by a factor of 12 with a response time penalty of less than 1%, or by a factor of 50 with a response time penalty of less than 2%, and a memory requirement of around 9 Kbytes. The List Algorithm can reduce recovery time by a factor of 50 but cannot achieve a response time penalty of less than 16%.
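To make the cost of recovery concrete, the following minimal sketch shows the baseline full scan described above, in which every parity group is checked after a crash. It is an illustration only, not the PGS Bitmap Algorithm or the List Algorithm. It assumes XOR parity, an in-memory array represented as a list of dictionaries, and repair by recomputing parity from the data blocks; the names xor_blocks, is_consistent and full_recovery are hypothetical.

```python
from functools import reduce
from typing import Dict, List, Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def is_consistent(data_blocks: Sequence[bytes], parity: bytes) -> bool:
    """A parity group is consistent when its parity equals the XOR of its data (assumed XOR parity)."""
    return xor_blocks(data_blocks) == parity


def full_recovery(array: List[Dict]) -> List[int]:
    """Check every parity group and rewrite stale parity.

    Each group is a hypothetical dict: {"data": [bytes, ...], "parity": bytes}.
    Returns the indices of the groups whose parity had to be rewritten.
    """
    repaired = []
    for index, group in enumerate(array):
        if not is_consistent(group["data"], group["parity"]):
            # Assumption of this sketch: repair by recomputing parity from data.
            group["parity"] = xor_blocks(group["data"])
            repaired.append(index)
    return repaired


# Example array with three parity groups; the second has stale parity.
example_array = [
    {"data": [b"\x01\x02", b"\x03\x04"], "parity": b"\x02\x06"},
    {"data": [b"\xff\x00", b"\x0f\xf0"], "parity": b"\x00\x00"},
    {"data": [b"\x10\x20", b"\x10\x20"], "parity": b"\x00\x00"},
]
repaired_groups = full_recovery(example_array)
```

Because the loop visits every parity group regardless of whether it was being written at the time of the crash, the work grows with the size of the array, which is the delay the two algorithms reviewed here aim to reduce.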