12/24/2016

RAID Types And Levels

In computer storage, the standard RAID levels comprise a basic set of RAID (redundant array of independent disks) configurations that employ the techniques of striping, mirroring, or parity to create large reliable data stores from multiple general-purpose computer hard disk drives (HDDs). The most common types are RAID 0 (striping), RAID 1 and its variants (mirroring), RAID 5 (distributed parity), and RAID 6 (dual parity). RAID levels and their associated data formats are standardized by the Storage Networking Industry Association (SNIA) in the Common RAID Disk Drive Format (DDF) standard.
While most RAID levels can provide good protection against and recovery from hardware defects or defective sectors/read errors (hard errors), they do not provide any protection against data loss due to catastrophic failures (fire, water) or soft errors such as user error, software malfunction, or malware infection. For valuable data, RAID is only one building block of a larger data loss prevention and recovery scheme; it cannot replace a backup plan.

RAID Levels

There are many different ways to organize data in a RAID array. These ways are called "RAID levels". Different RAID levels have different speed and fault tolerance properties. RAID level 0 is not fault tolerant. Levels 1, 5, 6, and 1+0 (RAID 10) are fault tolerant to different degrees: should one of the hard drives in the array fail, the data is reconstructed on the fly and no access interruption occurs.
RAID levels 2, 3, and 4 are theoretically defined but not used in practice.
There are some more complex layouts: RAID 5E/5EE (integrating some spare space), RAID 50 and 60 (a combination of RAID 5 or 6 with RAID 0), and RAID DP. These are, however, beyond the scope of this reference.

RAID levels triangle

RAID levels comparison chart

                     RAID 0   RAID 1        RAID 5           RAID 6           RAID 10
Min number of disks  2        2             3                4                4
Fault tolerance      None     1 disk        1 disk           2 disks          1 disk
Disk space overhead  None     50%           1 disk           2 disks          50%
Read speed           Fast     Fast          Slow, see below  Slow, see below  Fast
Write speed          Fast     Fair          Slow, see below  Slow, see below  Fair
Hardware cost        Cheap    High (disks)  High             Very high        High (disks)
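The disk space overhead row of the chart can be turned into a small capacity calculator. This is only a sketch that assumes identical disks; the function name `usable_capacity` and the string level names are made up for this illustration, and RAID1 is limited to the two-disk mirror described in this article.

```python
def usable_capacity(level: str, n_disks: int, disk_size: float) -> float:
    """Usable capacity of an array of identical disks (same unit as disk_size)."""
    min_disks = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in min_disks:
        raise ValueError(f"unsupported RAID level: {level}")
    if n_disks < min_disks[level]:
        raise ValueError(f"RAID{level} needs at least {min_disks[level]} disks")
    if level == "0":
        return n_disks * disk_size            # no overhead
    if level == "1":
        if n_disks != 2:
            raise ValueError("this sketch only models a two-disk mirror")
        return disk_size                      # 50% overhead
    if level == "5":
        return (n_disks - 1) * disk_size      # one disk of overhead
    if level == "6":
        return (n_disks - 2) * disk_size      # two disks of overhead
    # RAID10: half of the disks hold the second copy
    if n_disks % 2:
        raise ValueError("RAID10 needs an even number of disks")
    return n_disks // 2 * disk_size


if __name__ == "__main__":
    for level, n in [("0", 4), ("1", 2), ("5", 4), ("6", 4), ("10", 4)]:
        print(f"RAID{level} with {n} x 1TB disks: {usable_capacity(level, n, 1.0)} TB")
```

With ten 1TB drives this gives 9TB for RAID5 and 8TB for RAID6, matching the examples given later in the text.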

Striping and blocks

Striping is a technique to store data on the disk array. The contiguous stream of data is divided into blocks, and blocks are written to multiple disks in a specific pattern. Striping is used with RAID levels 0, 5, 6, and 10.
Block size is selected when the array is created. Typically, blocks are from 32KB to 128KB in size.

RAID Level 0 (Stripe set)

Use RAID0 when you need performance but the data is not important.
In a RAID0, the data is divided into blocks, and blocks are written to disks in turn.
RAID0 provides the most speed improvement, especially for write speed, because read and write requests are evenly distributed across all the disks in the array. If a request comes for, say, blocks 1, 2, and 3, each block is read from its own disk, so the data is read three times faster than from a single disk. Note that RAID1 (mirror) can provide a similar improvement for reads but not for writes.
However, RAID0 provides no fault tolerance at all. Should any of the disks in the array fail, the entire array fails and all the data is lost.
RAID0 solutions are cheap, and RAID0 uses all the disk capacity.
If a RAID0 controller fails, you can recover the array relatively easily using RAID recovery software. Keep in mind, however, that if a disk fails, the data is lost irreversibly.
Disk 1  Disk 2  Disk 3
1       2       3
4       5       6
7       8       9
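The round-robin placement in the layout above can be sketched as a mapping from a byte offset to a disk and a row. The function names and the 64KB block size are assumptions chosen for this example (the text only says blocks are typically 32KB to 128KB); disks and rows are counted from zero in `raid0_locate`, while `raid0_layout` uses the 1-based block numbers of the table.

```python
from typing import List, Tuple


def raid0_locate(offset: int, block_size: int, n_disks: int) -> Tuple[int, int, int]:
    """Map a byte offset in the array to (disk, row, offset inside the block)."""
    block = offset // block_size      # logical block number, counted from 0
    disk = block % n_disks            # blocks go to the disks in turn
    row = block // n_disks            # each full pass over the disks is one row
    return disk, row, offset % block_size


def raid0_layout(n_blocks: int, n_disks: int) -> List[List[int]]:
    """Rows of 1-based block numbers, one column per disk."""
    return [
        list(range(start, min(start + n_disks, n_blocks + 1)))
        for start in range(1, n_blocks + 1, n_disks)
    ]


if __name__ == "__main__":
    for row in raid0_layout(9, 3):
        print(row)
    print(raid0_locate(4 * 65536 + 10, 65536, 3))
```

Because consecutive blocks land on different disks, a request spanning several blocks keeps several disks busy at once, which is where the speed improvement comes from.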

RAID Level 1 (Mirror)

Use mirroring when you need reliable storage of relatively small capacity.
Mirroring (RAID1) stores two identical copies of data on two hard drives. Should one of the drives fail, all the data can be read from the other drive. Mirroring does not use blocks and stripes.
Read speed can be improved in certain implementations, because read requests are sent to the two drives in turn. Similar to RAID0, this should increase speed by a factor of two. However, not all implementations take advantage of this technique.
Write speed on RAID1 is the same as the write speed of a single disk, because all the copies of the data must be updated.
RAID1 uses the capacity of one of its drives to maintain fault tolerance. This amounts to 50% capacity loss for the array. E.g. if you combine two 500GB drives in RAID1, you'd only get 500GB of usable disk space.
If a RAID1 controller fails, you need to recover neither the array configuration nor the data. To get the data, just connect either of the drives to a known-good computer.
Disk 1  Disk 2
1       1
2       2
3       3
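The mirror behaviour described above (every write goes to both drives, reads alternate between them, and one surviving drive is enough) can be modelled in a few lines. This is a toy in-memory sketch, not how any particular controller works; the class `Mirror` and its attributes are hypothetical names.

```python
class Mirror:
    """Toy two-disk RAID1: each disk is a list of blocks held in memory."""

    def __init__(self, n_blocks: int):
        self.disks = [[None] * n_blocks, [None] * n_blocks]
        self.failed = [False, False]
        self._next = 0  # disk that serves the next read

    def write(self, block: int, data: bytes) -> None:
        # Every healthy copy must be updated, so a write costs as much
        # as a write to a single disk.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def read(self, block: int):
        # Alternate between the disks; skip a disk that has failed.
        for _ in range(2):
            i = self._next
            self._next = 1 - i
            if not self.failed[i]:
                return i, self.disks[i][block]
        raise IOError("both copies are lost")


if __name__ == "__main__":
    m = Mirror(3)
    m.write(0, b"block 1")
    print(m.read(0), m.read(0))   # served by disk 0, then disk 1
    m.failed[0] = True            # simulate a drive failure
    print(m.read(0))              # still readable from disk 1
```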

RAID Level 5 (Stripe with parity)

Use RAID5 when you need large, reliable, relatively cheap storage.
RAID5 writes data blocks evenly to all the disks, in a pattern similar to RAID0. However, one additional "parity" block is written in each row. This additional parity, derived from all the data blocks in the row, provides redundancy. If one of the drives fails and thus one block in the row is unreadable, the contents of this block can be reconstructed using parity data together with all the remaining data blocks.
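The text does not name the parity function; in RAID5 it is normally a bytewise XOR of the data blocks in the row. A minimal sketch under that assumption (the helper name `xor_blocks` is made up for this example) shows why one missing block can always be rebuilt:

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    row = [b"AAAA", b"BBBB", b"CCCC"]      # data blocks of one row
    parity = xor_blocks(row)               # the extra block written to the row

    # The disk holding row[1] fails: XOR the survivors with the parity block.
    rebuilt = xor_blocks([row[0], row[2], parity])
    print(rebuilt == row[1])
```

XOR-ing a value twice cancels it out, so combining the parity with every surviving data block leaves exactly the missing block.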
If all drives are OK, read requests are distributed evenly across drives, providing read speed similar to that of RAID0. For N disks in the array, RAID0 provides N times faster reads and RAID5 provides (N-1) times faster reads. If one of the drives has failed, the read speed degrades to that of a single drive, because all blocks in a row are required to serve the request.
Write speed of a RAID5 is limited by the parity updates. For each written block, its corresponding parity block has to be read, updated, and then written back. Thus, there is no significant write speed improvement on RAID5, if any at all.
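The read-update-write cycle for a small write can be sketched as follows, again assuming XOR parity. The function names are hypothetical. Note that the old contents of the data block must be read as well as the old parity, so one logical write turns into two reads and two writes.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """New parity after one data block in the row is overwritten.

    XOR-ing with old_data removes its contribution from the parity,
    XOR-ing with new_data adds the new contribution.
    """
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


if __name__ == "__main__":
    row = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = xor_bytes(xor_bytes(row[0], row[1]), row[2])

    new_block = b"\xaa\xbb"
    new_parity = updated_parity(row[1], new_block, parity)

    # Same result as recomputing the parity from the whole row.
    full = xor_bytes(xor_bytes(row[0], new_block), row[2])
    print(new_parity == full)
```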
The capacity of one member drive is used to maintain fault tolerance. E.g. if you have 10 drives of 1TB each, the resulting RAID5 capacity would be 9TB.
If a RAID5 controller fails, you can still recover data from the array with RAID 5 recovery software. Unlike RAID0, RAID5 is redundant and it can survive one member disk failure.
While the layout shown below might seem simple enough, there is a variety of different layouts in practical use. Left/right and synchronous/asynchronous produce four possible combinations. Further complicating the issue, certain controllers implement delayed parity.
Disk 1  Disk 2  Disk 3
1       2       P
3       P       4
P       5       6
7       8       P
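The layout above moves the parity block one disk to the left in each row and fills the remaining disks with data blocks from left to right, a pattern often called left-asymmetric. The sketch below generates only this one variant; the function name is hypothetical, and other controllers may use any of the other layouts mentioned above.

```python
from typing import List


def raid5_layout(n_disks: int, n_rows: int) -> List[List[str]]:
    """Rows of block labels: parity moves right to left, data fills left to right."""
    rows = []
    block = 1
    for r in range(n_rows):
        parity_disk = (n_disks - 1) - (r % n_disks)   # last disk first, then leftwards
        row = []
        for d in range(n_disks):
            if d == parity_disk:
                row.append("P")
            else:
                row.append(str(block))
                block += 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in raid5_layout(3, 4):
        print("  ".join(row))
```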

RAID Level 6 (Stripe with dual parity)

Use RAID6 when you need large, highly reliable, relatively expensive storage.
RAID6 uses a block pattern similar to RAID5, but utilizes two different parity functions to derive two different parity blocks per row. If one of the drives fails, its contents are reconstructed using one set of parity data. If another drive fails before the array is recovered, the contents of the two missing drives are reconstructed by combining the remaining data and two sets of parity.
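The text does not say which two parity functions are used. One common choice, assumed here, is a plain XOR parity (P) plus a second parity (Q) in which each data byte is first multiplied by a different power of 2 in the finite field GF(2^8). The reducing polynomial 0x11D, the generator 2, and all function names below are assumptions of this sketch, and for brevity it works on one byte per data disk and only covers the case where two data disks (not parity disks) are lost.

```python
from typing import List, Optional, Tuple


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) with reducing polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    # For non-zero a, a**255 == 1 in GF(2^8), so a**254 is its inverse.
    return gf_pow(a, 254)


def pq_parity(data: List[int]) -> Tuple[int, int]:
    """P is the XOR of the data bytes, Q is the XOR of 2**i * data[i]."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q


def recover_two(data: List[Optional[int]], x: int, y: int, p: int, q: int) -> Tuple[int, int]:
    """Rebuild the bytes of failed data disks x and y from the survivors, P and Q."""
    pxy, qxy = p, q
    for i, d in enumerate(data):
        if i not in (x, y):
            pxy ^= d                         # leaves Dx ^ Dy
            qxy ^= gf_mul(gf_pow(2, i), d)   # leaves 2**x * Dx ^ 2**y * Dy
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(qxy ^ gf_mul(gy, pxy), gf_inv(gx ^ gy))
    return dx, pxy ^ dx


if __name__ == "__main__":
    row = [0x12, 0x34, 0x56, 0x78]
    p, q = pq_parity(row)
    print(recover_two([0x12, None, 0x56, None], 1, 3, p, q) == (0x34, 0x78))
```

With a single failed drive, the XOR parity alone is enough, exactly as in RAID5; the second parity is only needed when two blocks of a row are missing.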
Read speed of the N-disk RAID6 is (N-2) times faster than the speed of a single drive, similar to RAID levels 0 and 5. If one or two drives fail in RAID6, the read speed degrades significantly because a reconstruction of missing blocks requires an entire row to be read.
There is no significant write speed improvement in RAID6 layout. RAID6 parity updates require even more processing than those in RAID5.
The capacity of two member drives is used to maintain fault tolerance. For an array of 10 drives of 1TB each, the resulting RAID6 capacity would be 8TB.
The recovery of a RAID6 from a controller failure is fairly complicated.
Disk 1  Disk 2  Disk 3  Disk 4
1       2       P1      P2
3       P1      P2      4
P1      P2      5       6
P2      7       8       P1

RAID Level 10 (Mirror over stripes)

Use RAID10 when you need large, fast, reliable storage and can accept the higher cost.
RAID10 uses two identical RAID0 arrays to hold two identical copies of the content.
Read speed of the N-drive RAID10 array is N times faster than that of a single drive. Each drive can read its block of data independently, same as in RAID0 of N disks.
Writes are two times slower than reads, because both copies have to be updated. As far as writes are concerned, RAID10 of N disks is the same as RAID0 of N/2 disks.
Half the array capacity is used to maintain fault tolerance. In RAID10, the number of disks spent on redundancy grows with the size of the array, whereas in RAID levels 5 and 6 it stays fixed at one or two disks regardless of array size. This makes RAID10 the most expensive RAID type when scaled to large capacity.
If there is a controller failure in a RAID10, any subset of the drives forming a complete RAID0 can be recovered in the same way the RAID0 is recovered.
Similarly to RAID 5, several variations of the layout are possible in implementation.
Disk 1  Disk 2  Disk 3  Disk 4
1       2       1       2
3       4       3       4
5       6       5       6
7       8       7       8
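For the layout above, where the first half of the disks forms one RAID0 and the second half holds an identical copy, the location of each block and the set of survivable failures can be sketched as follows. The function names are hypothetical, disks are counted from zero, and other RAID10 layout variations would need a different mapping.

```python
from typing import Set, Tuple


def raid10_copies(block: int, n_disks: int) -> Tuple[int, int]:
    """Disks holding the two copies of a 1-based block number."""
    half = n_disks // 2
    column = (block - 1) % half       # position inside each RAID0 half
    return column, column + half


def raid10_survives(failed: Set[int], n_disks: int) -> bool:
    """True if at least one copy of every column is still readable."""
    half = n_disks // 2
    return all(not (c in failed and c + half in failed) for c in range(half))


if __name__ == "__main__":
    print(raid10_copies(1, 4), raid10_copies(4, 4))
    print(raid10_survives({0, 1}, 4))   # one complete RAID0 half remains
    print(raid10_survives({0, 2}, 4))   # both copies of one column are gone
```

Any single failure is survivable, which matches the fault tolerance of 1 disk in the comparison chart; whether a second failure is survivable depends on which disk fails.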
