Redundant Array of Independent (Inexpensive) Disks or RAID
Tobin Maginnis
Updated 1-Nov-07

Background Concepts

STRIPING

Striping is the idea of breaking a file into chunks (e.g., 4-bit nibbles, 8-bit bytes, 512-byte sectors, or even logical OS blocks) and writing the chunks in parallel to a set of disk drives. This design allows a file to be read or written across 2-10 disks and thus, ideally, divides the total file access time by the number of drives employed. For example, five drives could speed up file access by up to a factor of five.

XOR OPERATION

The exclusive OR (XOR) operation can be used to reconstruct lost bits. For example, given 1 XOR 0 = 1, any one of the three bits can be recovered by XORing the remaining two bits.

    Data1   1 0 1 0
    Data2   1 1 0 0
            -------
    XOR     0 1 1 0

In this way a RAID controller can XOR each byte within a "chunk" with the byte at the same relative position in another chunk to create a third "parity" chunk. By XORing chunks in sequence, one parity chunk can cover a stripe across any number of disk drives, and any single lost chunk can be rebuilt by XORing the surviving chunks with the parity chunk.

MEAN TIME BETWEEN FAILURES

Mean Time Between Failures (MTBF) is a statistical approach to quantifying the longevity of equipment. For example, if you expect a car to run for 150,000 miles and the average car is driven at 40 miles per hour, then its MTBF would be approximately 3,750 hours (150,000 / 40). Disk drives are usually rated at an MTBF of 250,000 hours (~29 years). However, since this is a statistical inference, and since each drive is equally likely to fail, using two drives at the same time cuts the expected time to the first failure in half. Using four drives reduces the MTBF by a factor of four, and so on. Thus, a 5-10 disk RAID array would have, on average, an MTBF of 50,000-25,000 hours (roughly 6-3 years).

RAID LEVELS

There are many RAID "levels" or configuration types, and new types are defined from time to time. Currently there are 10 main types:

* RAID-0 configuration uses striping without data redundancy.
It offers the best performance, but no fault tolerance. To ensure true parallel access, a special RAID controller marshals data to and from the stripes and synchronizes the disk drives to remove any rotational latency that may exist among the parallel sectors.

* RAID-1 level is also known as disk mirroring and consists of pairs of drives (the drive count is a multiple of two) that store duplicate copies of the data. In the event of a drive failure, the other drive of the pair acts as a hot backup of the data. There is no striping. Read performance is improved since the two disks can service different reads at the same time. Write performance, on the other hand, is the same as for single-disk storage. RAID-1 offers a reasonable tradeoff of performance versus fault tolerance in multi-user systems.

* RAID-2 configurations use striping across disks, with some disks storing a Hamming code or other bit-level error checking and correcting (ECC) information. RAID-2 was found in mainframe environments and has no advantage over RAID-3.

* RAID-3 configurations use striping and the XOR parity operation described above, with the parity chunks stored on one dedicated "parity" drive. The data chunks must be XORed in sequence before the parallel write to all drives can take place. This creates a write bottleneck in the RAID-3 controller.

* RAID-4 level uses larger, sector-size chunks for stripes. This allows overlapped read operations. But since every write operation has to update the single parity drive, no overlapped write operations are possible.

* RAID-5 configurations, like RAID-4, use sector-size chunks, but each stripe holds one fewer data chunk than the number of drives, with the remaining chunk holding parity. Thus, the parity chunk is no longer assigned to one drive. Instead, it is written to each successive drive in round-robin fashion. With these changes, a stripe contains two or more data chunks, and the previous stripe may be written while the XOR operation continues on the current stripe.
This allows read and write operations to be overlapped, but the XOR overhead means that RAID-5 still has a longer write delay than RAID-1. Nevertheless, RAID-5 is usually preferred in multi-user systems since it combines data integrity with fast read access (reads tend to be the majority of disk operations). Note that storing parity information rather than fully redundant data (as in RAID-1) costs only one additional disk drive's worth of capacity, which dramatically reduces overall storage costs. As a rule of thumb, a RAID-5 array stores up to twice as much as a RAID-1 array built from the same drives: N-1 drives' worth of data versus N/2, a ratio that approaches two as the number of drives N grows.

* RAID-6 configurations improve on the reliability of RAID-5 systems by employing a second, additional per-stripe parity scheme whose result is stored on a different drive than the original parity chunk. For example, a Reed-Solomon code spatially distributes information and adds a second, independent level of correction, so the array can survive the loss of two drives.

* RAID-7 level employs a real-time embedded operating system in the controller, as well as high-speed bus caching operations, to read or write any disk and sector at any time, thereby speeding up overall write operations.

* RAID-10 configuration combines RAID-1 and RAID-0 designs. The user sees a high-performance RAID-0 configuration, but each member of the stripe is actually a RAID-1 array of drives.

* RAID-53 configuration offers an array of stripes in which each stripe is a RAID-3 array of disks.

SAN

Storage Area Networks (SANs) configure a 5-to-10 disk RAID with a host processor attached to a high-speed network interface such as Fibre Channel or InfiniBand. A special network interface is required on the client computer, but in exchange, the $5,000-$10,000 SAN will perform disk I/O at 250 MB/s, or about five times the sequential access rate of a local hard disk.

http://en.wikipedia.org/wiki/Raid
http://en.wikipedia.org/wiki/Raid_0
http://en.wikipedia.org/wiki/Storage_area_network
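
As an illustration of the XOR OPERATION section, the following Python sketch builds a parity chunk and then rebuilds a lost chunk from the survivors. The function names (xor_chunks, rebuild_missing) and the sample chunks are made up for this example; a real controller does the same arithmetic in hardware or firmware and must already know which drive failed.

```python
from functools import reduce


def xor_chunks(chunks):
    """XOR equal-length byte chunks together, position by position."""
    length = len(chunks[0])
    if any(len(chunk) != length for chunk in chunks):
        raise ValueError("all chunks must have the same length")
    # zip(*chunks) yields one tuple per byte position across all chunks.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def rebuild_missing(surviving_chunks):
    """Rebuild the single missing chunk of a stripe.

    surviving_chunks must hold every remaining data chunk plus the parity
    chunk; XORing them all together yields the lost chunk.
    """
    return xor_chunks(surviving_chunks)


# The 4-bit example from the text, stored one bit per byte.
data1 = bytes([1, 0, 1, 0])
data2 = bytes([1, 1, 0, 0])
parity = xor_chunks([data1, data2])

# A stripe with three data chunks: lose the middle one, then recover it.
stripe = [b"RAID", b"disk", b"data"]
stripe_parity = xor_chunks(stripe)
recovered = rebuild_missing([stripe[0], stripe[2], stripe_parity])

print(list(parity))
print(recovered)
```

The same routine serves for both creating parity and recovering data, because XOR is its own inverse.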
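
The MTBF arithmetic from the MEAN TIME BETWEEN FAILURES section can be checked with a few lines of Python. This is only a sketch of the simplified model used in the text (independent, identical drives; array MTBF taken as the time to the first drive failure, not to data loss). The function names and the 365-day year are assumptions of this example.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: ignore leap years


def array_mtbf_hours(drive_mtbf_hours, drive_count):
    """MTBF until the first drive in the array fails, per the text's model."""
    if drive_count < 1:
        raise ValueError("drive_count must be at least 1")
    return drive_mtbf_hours / drive_count


def hours_to_years(hours):
    """Convert a duration in hours to years of continuous operation."""
    return hours / HOURS_PER_YEAR


# The car analogy: 150,000 miles at 40 miles per hour.
car_mtbf_hours = 150_000 / 40

# Drives rated at 250,000 hours each, in arrays of various sizes.
for count in (1, 2, 4, 5, 10):
    hours = array_mtbf_hours(250_000, count)
    print(f"{count:2d} drives: {hours:9,.0f} hours (~{hours_to_years(hours):.1f} years)")
```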
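
The round-robin parity placement described for RAID-5 can be pictured with a short Python sketch. The rotation rule used here (parity for stripe s goes on drive s modulo the number of drives) is an assumption chosen for simplicity; actual controllers use several different rotation orders. The function name raid5_layout is hypothetical.

```python
def raid5_layout(drive_count, stripe_count):
    """Return one row per stripe listing what each drive holds.

    "P" marks the parity chunk; "D0", "D1", ... are data chunks in order.
    Assumption: parity for stripe s sits on drive s % drive_count.
    """
    if drive_count < 3:
        raise ValueError("RAID-5 needs at least three drives")
    rows = []
    next_chunk = 0
    for stripe in range(stripe_count):
        parity_drive = stripe % drive_count
        row = []
        for drive in range(drive_count):
            if drive == parity_drive:
                row.append("P")
            else:
                row.append(f"D{next_chunk}")
                next_chunk += 1
        rows.append(row)
    return rows


# Four drives, four stripes: the parity chunk visits each drive once.
for row in raid5_layout(4, 4):
    print("  ".join(f"{cell:>3}" for cell in row))
```

Because no single drive holds all the parity, writes to different stripes can update different parity drives, which is what removes the RAID-4 bottleneck.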
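
The rule of thumb comparing RAID-5 and RAID-1 storage can be made concrete with a small capacity calculation. This sketch assumes equal-size drives and two-way mirroring for RAID-1; the function name usable_drives is made up for the example, and capacity is expressed in units of one drive.

```python
def usable_drives(level, drive_count):
    """Usable capacity, in units of one drive, for a few RAID levels.

    Assumptions: all drives are the same size; RAID-1 uses two-way mirrors.
    """
    if level == 0:
        return drive_count        # striping only, no redundancy
    if level == 1:
        return drive_count / 2    # every chunk is stored twice
    if level == 5:
        return drive_count - 1    # one drive's worth of parity
    if level == 6:
        return drive_count - 2    # two drives' worth of parity
    raise ValueError(f"level {level} is not covered by this sketch")


# Ratio of RAID-5 to RAID-1 capacity for the same number of drives.
for count in (4, 6, 10):
    ratio = usable_drives(5, count) / usable_drives(1, count)
    print(f"{count:2d} drives: RAID-5 stores {ratio:.2f} times as much as RAID-1")
```

The ratio is 2(N-1)/N, so it stays below two and gets closer to two as drives are added.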