Abstract
A redundant array of independent disks (RAID) consisting of G disks provides protection against single disk failures by adding one parity block for each G-1 data blocks. In a clustered RAID, the G data/parity blocks are distributed over a cluster of C disks, C>G, thus reducing the additional load on each disk due to a single disk failure. The authors present a fast algorithm for distributing parity groups of size G over C disks. They create mappings for each parity group based on almost-random permutations. An analytical model is constructed to predict recovery time and read/write performance. The analysis shows that the clustered RAID is significantly more tolerant of disk failure than the basic RAID scheme. Both recovery time and performance degradation during recovery are substantially reduced in the clustered RAID.
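The load-spreading idea can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that Python's seeded `random.Random.shuffle` can stand in for the almost-random permutations, and the names `place_groups` and `rebuild_reads` are hypothetical. Each parity group is placed on G distinct disks out of C, and the sketch counts how many blocks every surviving disk must read to rebuild a failed one.

```python
import random


def place_groups(num_groups, G, C, seed=0):
    """Place each parity group on G distinct disks chosen from C disks.

    Assumption: a seeded shuffle stands in for the paper's almost-random
    permutations; this is an illustration, not the published mapping.
    """
    if not 1 < G <= C:
        raise ValueError("need 1 < G <= C")
    rng = random.Random(seed)
    layout = []
    for _ in range(num_groups):
        disks = list(range(C))
        rng.shuffle(disks)                 # pseudo-random permutation of the disks
        layout.append(tuple(disks[:G]))    # first G entries hold this group's blocks
    return layout


def rebuild_reads(layout, failed, C):
    """Count the blocks each surviving disk must read to rebuild `failed`."""
    reads = {d: 0 for d in range(C) if d != failed}
    for group in layout:
        if failed in group:
            # Every other member of the group is read once to recompute
            # the lost block by XOR.
            for d in group:
                if d != failed:
                    reads[d] += 1
    return reads


if __name__ == "__main__":
    num_groups, G = 10_000, 5
    for C in (5, 20):                      # C == G is the basic RAID case
        layout = place_groups(num_groups, G, C)
        reads = rebuild_reads(layout, failed=0, C=C)
        blocks_per_disk = num_groups * G / C
        mean_fraction = sum(reads.values()) / len(reads) / blocks_per_disk
        print(f"C={C}: mean fraction of each surviving disk read = {mean_fraction:.2f}")
```

Under uniformly random placement, the expected fraction of a surviving disk's blocks that must be read during a rebuild is (G-1)/(C-1). That fraction is 1 when C = G, the basic RAID case, and shrinks as C grows, which is the effect the abstract describes.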