RAID Explained
Defining RAID
Redundant Array of Independent (or Inexpensive) Disks, more commonly referred to as RAID, is the integration of two or more drives in a number of different configurations. Standard RAID levels range from RAID 0 to RAID 6, and several of them can be combined. Other RAID types such as RAID 10, RAID 7, and RAID S exist but will not be discussed in this guide.
This guide is intended to help you understand what the different RAID levels do and to help you decide which level is appropriate for your needs. RAID is commonly found in server environments, where it grew out of a need for redundancy and speed. Initially RAID required expensive controller cards and a significant level of skill to implement. Today, relatively inexpensive RAID controllers give consumers excellent performance, and the skill required to set them up has dropped considerably. We still suggest you take the time to research how RAID works and what your system requires.
RAID solutions exist for nearly every commonly found kind of hard drive, including:
- IDE
- SATA
- SATAII
- SCSI
- SAS
Different RAID Levels
A brief breakdown of the RAID levels follows:
- RAID-0: RAID Level 0 is not redundant, so it doesn't really live up to the "RAID" acronym. This configuration is usually referred to as striping. In RAID-0, data is split across drives, yielding higher performance, but a failure on any drive will result in complete data loss. The image below shows how the data is broken apart and spread across multiple drives. Of the levels covered here, RAID 0 gives the greatest performance in terms of read/write speed.
Since RAID 0 is the most commonly implemented consumer-level RAID option, we will explain it a bit more. Think of a standard hard drive as a book with a librarian next to it. When you open a document, you are asking the librarian to open the book, search the index for your document, locate the page, and turn to it. Let's say this process takes approximately 5 seconds. Now suppose you have created a RAID 0 with two drives and you open your document again. This time you are asking two librarians to open two books, each of which holds half of your document. The process takes less time because two physical drives are at work, each finding half of the information. In theory the access time is halved; in reality it is close to half but not quite, because the controller and your system still have to put the two halves back together. The net result is a significant gain in performance, but not quite 2X.
Extending this concept, you can add more hard drives and gain more performance, since each additional hard drive theoretically speeds things up. In reality, the return in performance drops off after about 5 hard drives.
- RAID-1: RAID Level 1 provides redundancy by writing all data to two or more drives. Writes are a bit slower, because the data needs to be written to every drive in the set. If any single drive fails, none of the data is lost, because it has been duplicated on the other drive. At entry level this setup is the most cost-efficient way to get redundancy, because only two drives are required. This level is also referred to as mirroring. The image below shows how the data is copied or mirrored across multiple drives. It also shows that if there is a failure on one drive, the other drive is still intact and the data can still be accessed.
- RAID-2: RAID Level 2 stripes data at the bit level and stores error-correcting (Hamming) codes on dedicated drives. It was intended for drives that don’t have built-in error correction. RAID-2 is almost never used today because almost all drives have ECC.
- RAID-3: RAID Level 3 stripes data at a byte level across several drives, with parity stored on a dedicated drive.
- RAID-4: RAID Level 4 stripes data in blocks across multiple drives. RAID-4 is similar in speed to RAID-0 in terms of reads, but is slower in writes because parity information has to be written to a dedicated parity drive to allow for recovery from a failure.
- RAID-5: Block Interleaved Distributed Parity: RAID Level 5 is similar to RAID-4, but the parity is distributed across all of the drives instead of being kept on a single one. RAID-5 needs three or more drives, typically 5.
- RAID-6: Independent Data Disks with Double Parity: Similar to RAID 5, with block-level striping and parity data distributed across all disks, but two parity blocks are written for each stripe, so the array can survive the failure of any two drives. RAID-6 needs four or more drives.
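The parity used by RAID-4 and RAID-5 is commonly computed as a bitwise XOR of the data blocks in a stripe. The sketch below is only an illustration of that idea, not how any particular controller is implemented; the helper name xor_blocks and the sample blocks are made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks that make up one stripe.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data_blocks)

# Simulate losing the second drive, then rebuild its block from
# the surviving data blocks plus the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_blocks[1])
```

Because XOR-ing a value with itself cancels out, combining the surviving blocks with the parity block leaves exactly the missing block. This only works when a single block per stripe is lost, which is why RAID-6 adds a second parity block.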
Considerations:
Setting up a RAID requires multiple drives: as few as 2 and as many as 8 in common consumer-level setups. In the enterprise arena RAID goes much further and can span many hard drives. RAID also requires a controller, either onboard or dedicated (stand-alone). Onboard solutions are cheaper, but performance is lower because the controller uses some of the CPU to do its job. Stand-alone cards offer the greatest performance and feature sets, and often carry 512MB or more of memory to assist in the caching and writing of data. Stand-alone cards are also the most expensive option; they can be as cheap as $50 or cost several thousand dollars for enterprise-grade hardware. Consider the level of performance you require as well as your budget before making a decision. We suggest you check out HighPoint for some excellent affordable controller options.
Most commonly used RAID setups.
RAID-0:
RAID-0 is one of the more common RAID settings. RAID-0 is known as a striped set. Data striping is the segmentation of data so that the segments can be written to multiple devices. This configuration drastically improves performance but has a harsh downside. In a RAID-0 the data is broken down into parts, and the number of hard drives determines how many parts there are. The parts are then written to the disks at the same time, on the same sector of each disk. Breaking up the data like this allows each disk to seek simultaneously, giving this configuration extremely large bandwidth. However, if a sector on one of the drives fails or is damaged, the parallel sectors on all the other drives are useless, because part of the data is now corrupted.
Take another look at the picture below. As you can see, the data is broken down into segments (A1, A2, A3, etc.). The data in section A1 and section A2 is written at the same time, instead of sequentially as it would be on a single non-RAID drive. The performance increase here is obvious. However, if there is a failure on the second drive, for instance in section A4, you can see how the entire string is broken, rendering it useless.
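The striping described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the function names stripe and reassemble, the two-byte segment size, and the use of None to stand for a failed drive are all invented for illustration, and real controllers work on much larger blocks.

```python
def stripe(data, num_drives, chunk_size):
    """Split data into segments and deal them round-robin across drives."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        drives[chunk_index % num_drives].append(data[offset:offset + chunk_size])
    return drives


def reassemble(drives):
    """Interleave the segments back into their original order."""
    if any(drive is None for drive in drives):
        # With no redundancy, one missing drive breaks every stripe.
        raise IOError("a drive has failed; the striped data cannot be rebuilt")
    chunks = []
    for row in range(max(len(drive) for drive in drives)):
        for drive in drives:
            if row < len(drive):
                chunks.append(drive[row])
    return b"".join(chunks)


# Segments A1..A6 spread over two drives:
# the first drive gets A1, A3, A5 and the second gets A2, A4, A6.
drives = stripe(b"A1A2A3A4A5A6", num_drives=2, chunk_size=2)
original = reassemble(drives)
```

Passing a list such as [drives[0], None] to reassemble raises an error, which mirrors the point made above: losing the second drive leaves only every other segment, so the whole string is unusable.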
RAID-1:
Another common form of RAID is RAID-1, or a mirrored set. This is when two or more identical copies of the data are stored and maintained on separate media. If one of the drives goes bad, there is an identical copy on another drive. The disadvantage is cost: you are paying for twice the storage but getting the usable capacity of only one drive. There is not much more to say about this type of setup; it’s just important to know that a RAID 1 setup provides fault tolerance against drive errors and failures.
Once again take a look at the picture below. Notice the data is written exactly the same on both drives, so if one of the drives fails the other one retains the data.
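The cost trade-off between these levels comes down to how much of the raw disk space is actually usable. The sketch below is a rough calculator that assumes every drive in the array is the same size; the function name usable_capacity and the minimum drive counts it enforces are choices made for this example, and it covers only RAID 0, 1, 5, and 6.

```python
def usable_capacity(level, num_drives, drive_size_gb):
    """Return usable capacity in GB for an array of identical drives."""
    minimum_drives = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum_drives:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < minimum_drives[level]:
        raise ValueError(
            f"RAID {level} needs at least {minimum_drives[level]} drives"
        )
    if level == 0:
        # Striping only: every drive contributes its full size.
        return num_drives * drive_size_gb
    if level == 1:
        # Mirroring: every drive holds a full copy of the same data.
        return drive_size_gb
    if level == 5:
        # One drive's worth of space is used for parity.
        return (num_drives - 1) * drive_size_gb
    # RAID 6: two drives' worth of space is used for parity.
    return (num_drives - 2) * drive_size_gb


two_drive_stripe = usable_capacity(0, 2, 500)
two_drive_mirror = usable_capacity(1, 2, 500)
```

With two 500 GB drives, the striped set offers 1000 GB and the mirrored set offers 500 GB, which is the "twice the cost for the capacity of one" trade-off in numbers.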
Conclusion
As you can see, there are many reasons why someone may choose to use a RAID setup in their system. Some want speed, while others want to prevent data loss by keeping an extra copy of their hard drive. Once you decide to go with a RAID configuration, take a look at all the various types of RAID, for there are many.