RAID is an acronym for Redundant Array of Independent Disks, and it’s a common feature of server hardware that helps keep data available when a disk fails. It’s also just a fancy term for two or more hard disks connected together to gain extra speed, extra redundancy, or both. Why would you want to do this? Read on.
First off, it’s very difficult to describe RAID technologies as a whole, because the different configurations available to you behave very differently – but they all focus on either speed or reliability. Let’s break them down:
RAID 0: Striped
This configuration is all about speed. In short, data is spread across a number of disks (striped across the disks, in fact) – rather than being written to just one. This overcomes speed limitations of a single drive, so performance is theoretically multiplied by the number of disks you are using.
It’s a similar concept to having 4 cores in your CPU – instead of feeding all the work to one core in sequence, you send different parts of it to 4 cores and, in the ideal case, get the answers back 4 times as quickly. You also get to use the combined space of all of the drives, so 2 x 1TB in a striped configuration will show as a single 2TB drive.
On the downside, you also have as many points of failure as drives you are using – if just one of those drives fails, all your data will be lost, because every file has pieces on every drive. In reality then, this configuration is rarely used for anything important. If the data isn’t so valuable though, you might want to set up a RAID 0 on a home server or even a desktop machine.
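To make the striping idea concrete, here is a minimal Python sketch. It is a toy model rather than how any real controller works: the "disks" are plain lists, and the names stripe, read_back and the 4-byte chunk size are made up for illustration.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int = 4) -> list:
    """Deal fixed-size chunks of data across the disks in round-robin order."""
    disks = [[] for _ in range(num_disks)]
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        disks[index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def read_back(disks: list) -> bytes:
    """Reassemble the data by visiting the disks in the same round-robin order."""
    chunks = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


data = b"ABCDEFGHIJKLMNOPQRSTUVWX"
disks = stripe(data, num_disks=2)
print(disks[0])  # [b'ABCD', b'IJKL', b'QRST']
print(disks[1])  # [b'EFGH', b'MNOP', b'UVWX']
print(read_back(disks) == data)  # True

# Simulate losing one drive: every other chunk is gone, so the data is ruined.
disks[1] = []
print(read_back(disks) == data)  # False
```

Each disk only handles half of the chunks, which is where the speed comes from, and it is also why losing either disk leaves you with nothing usable.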
RAID 1: Mirrored
This configuration is all about data integrity and is far easier to explain. In a RAID 1 setup, data is mirrored to the other drives – a full copy of everything is kept at all times, because every bit of data is written to all of the drives at the same time. Because of this, you only get the capacity of a single drive, so 2 x 1TB drives set up to mirror each other will only give you 1TB total space.
This is perhaps the most common real world usage when two disks are available. When one dies, the data is still 100% there and ready for use, but the process of “re-building” the data array on the replacement drive can take a very long time.
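Mirroring can be sketched in a few lines of Python too. Again this is only a toy model under my own assumptions: the Mirror class and its method names are hypothetical, each "disk" is a dictionary of blocks, and a failed disk is simply marked as None.

```python
class Mirror:
    """Toy RAID 1: every write goes to all disks, any survivor can serve reads."""

    def __init__(self, num_disks=2):
        # Each disk maps a block number to its contents; None marks a dead disk.
        self.disks = [{} for _ in range(num_disks)]

    def write(self, block, payload):
        for disk in self.disks:
            if disk is not None:
                disk[block] = payload

    def read(self, block):
        for disk in self.disks:
            if disk is not None:
                return disk[block]
        raise IOError("all mirrors have failed")

    def fail(self, index):
        self.disks[index] = None

    def rebuild(self, index):
        """Copy every block from a surviving disk onto the replacement."""
        source = next(disk for disk in self.disks if disk is not None)
        self.disks[index] = dict(source)


array = Mirror(num_disks=2)
array.write(0, b"holiday photos")
array.fail(0)              # one drive dies...
print(array.read(0))       # ...but the data is still served by the other
array.rebuild(0)           # copy everything onto the replacement drive
```

The rebuild step copies every block, which hints at why re-building a real multi-terabyte array takes so long.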
RAID 0+1: Striped & Mirrored
This combines the best of both worlds by nesting RAID setups, but requires at least 4 disks. Two sets of 2 striped disks are set up, and each set is mirrored to the other. RAID 1+0 also exists, but doesn’t vary enough to warrant a separate explanation – it’s a case of striping your mirrors rather than mirroring your stripes!
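One place where the two nesting orders do differ, at least in a simplified model, is how they cope with a second disk failing. The Python sketch below assumes exactly 4 disks grouped into two pairs; the function names and the GROUPS layout are my own illustration, not anything a real controller exposes.

```python
from itertools import combinations

# Four disks, numbered 0-3, grouped into two pairs in both layouts.
GROUPS = [{0, 1}, {2, 3}]


def survives_raid01(failed):
    """RAID 0+1: each group is a stripe set, and the two sets mirror each other.
    A stripe set is lost if any member fails, so one set must be untouched."""
    return any(not (group & failed) for group in GROUPS)


def survives_raid10(failed):
    """RAID 1+0: each group is a mirrored pair, with data striped across pairs.
    Every pair must keep at least one working disk."""
    return all(group - failed for group in GROUPS)


# Try every possible pair of simultaneous failures.
for failed in combinations(range(4), 2):
    failed_set = set(failed)
    print(failed, survives_raid01(failed_set), survives_raid10(failed_set))
```

In this model both layouts survive any single failure and give the same usable space (half the total), but the striped mirrors tolerate more of the possible two-disk failures than the mirrored stripes do.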
RAID 2 & Above: Parity Bits
With 3 disks, you can actually achieve a good performance and integrity compromise by using what is called a parity disk. To explain this, think on a scale of bits instead of whole drives.
A parity bit is simply an XOR combination of the other bits. XOR is a logic operation that evaluates to true only if exactly ONE of the two input bits is true. See the following table, where A and B are data bits and P is the parity bit.
A B P
0 0 0
0 1 1
1 0 1
1 1 0
Now it turns out that this is very useful for error checking and repairing the data. If you were to erase the entire B column, you could rebuild it, because you still have both A and the parity bit, and given those two there is only one possible value for bit B.
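If you want to convince yourself of this, the table and the recovery trick fit in a few lines of Python. The parity function name is just for illustration; the ^ operator is Python's built-in XOR.

```python
def parity(a, b):
    """The parity bit is the XOR of the two data bits."""
    return a ^ b


# Rebuild the truth table: one (A, B, P) row for every combination of inputs.
rows = [(a, b, parity(a, b)) for a in (0, 1) for b in (0, 1)]

for a, b, p in rows:
    recovered_b = a ^ p  # pretend B was erased and rebuild it from A and P
    print(a, b, p, "recovered B:", recovered_b)
```

The recovered value matches the original B in every row, because XOR-ing A with the parity cancels A out and leaves B.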
Now, it should be easy to see that even if we had 2 x 1 terabyte drives’ worth of bits, we could still create a parity bit for every pair of data bits and place it on a 3rd drive that’s also a terabyte in size. And that’s RAID 3. With a 3 disk array, 2 are used to stripe the data, spreading it out for performance. The 3rd drive holds the parity, and if any one of the three drives dies, we can use the other 2 to recover it in full.
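Scaling the same trick up from single bits to whole drives looks something like the Python sketch below. It is a simplified model under my own assumptions: the "drives" are byte strings, the data is split one byte at a time, and the names build_array and recover are hypothetical.

```python
def xor_bytes(left, right):
    """XOR two equal-length byte strings together, byte by byte."""
    return bytes(x ^ y for x, y in zip(left, right))


def build_array(data):
    """Split data alternately over two data disks and compute a parity disk."""
    if len(data) % 2:
        data += b"\x00"  # pad so both data disks hold the same number of bytes
    disk_a, disk_b = data[0::2], data[1::2]
    return {"a": disk_a, "b": disk_b, "parity": xor_bytes(disk_a, disk_b)}


def recover(array, lost):
    """Rebuild the lost disk by XOR-ing the two survivors together."""
    first, second = (contents for name, contents in array.items() if name != lost)
    return xor_bytes(first, second)


array = build_array(b"RAID demo!")
for lost in array:
    # Whichever disk we pretend has died, the other two can reproduce it.
    print(lost, recover(array, lost) == array[lost])
```

It makes no difference whether a data disk or the parity disk is the one that fails, since any one of the three is the XOR of the other two.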
I won’t go into further detail about RAID 2, 3, 4, 5 and 6 because they’re basically all variants on where and how the parity (or similar error-correcting) data is stored or derived, and precisely how much recovery can be done. If you’d like to read up on those, I’d suggest the extensive Wikipedia page on the topic.
Can I Use RAID On My Home PC? Should I?
Both OS X and Windows have the ability to create software RAID configurations, but bear in mind that this is going to increase the load on your CPU due to the additional computation required. I won’t go into setting them up here, but if you’d like to know more or see a tutorial on MakeUseOf, let me know in the comments and I’ll get straight on it.
A lot of motherboards also include a form of semi-hardware RAID – I say semi-hardware, because they generally still need a driver in your OS to be able to access the data, but this is still one step up from a purely software RAID, and you can even install the OS onto them for a small performance boost.
The final method of doing RAID is with dedicated hardware – expansion cards that slot into your PC and take full control of the data side of things. These are generally the most reliable and best performing option, but the price is usually beyond consumer budgets.
As for whether you should be using a RAID, it’s certainly worth playing around with for geek points. In terms of real world computing though, the performance gains you can expect are often not worth the trouble involved (an SSD would far outperform a striped set of hard disks anyway), and the data redundancy you gain can be easily achieved with traditional backup methods.