Redundant Array of Independent Disks (RAID) is a term used to describe the technique of
improving data availability through the use of arrays of disks and various data-striping
methodologies. Disk arrays are groups of disk drives that work together to achieve higher data-transfer
and I/O rates than those provided by single large drives. An array is a set of multiple disk
drives plus a specialized controller (an array controller) that keeps track of how data is distributed
across the drives. Data for a particular file is written in segments to the different drives in the
array rather than being written to a single drive.
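The segment-by-segment distribution described above can be sketched in a few lines of Python. This
is only an illustration: the names stripe and unstripe, the round-robin placement, and the tiny
segment size are assumptions made for the example, not the behavior of any particular array
controller.

```python
def stripe(data: bytes, num_drives: int, segment_size: int) -> list:
    """Cut data into fixed-size segments and deal them round-robin to the drives."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), segment_size):
        segment = data[offset:offset + segment_size]
        # Segment 0 goes to drive 0, segment 1 to drive 1, and so on, wrapping around.
        drives[(offset // segment_size) % num_drives].append(segment)
    return drives


def unstripe(drives: list) -> bytes:
    """Reassemble the original data by reading the segments back in round-robin order."""
    out = []
    longest = max(len(d) for d in drives)
    for row in range(longest):
        for drive in drives:
            if row < len(drive):
                out.append(drive[row])
    return b"".join(out)


if __name__ == "__main__":
    layout = stripe(b"ABCDEFGHIJ", num_drives=3, segment_size=2)
    print(layout)
    print(unstripe(layout))
```

Because consecutive segments land on different drives, a real array can service them in parallel,
which is where the higher transfer rate comes from.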
Spreading data across several disks can improve both speed and reliability. When the disks are
arranged in certain patterns and use a specific controller, they are called a Redundant Array of
Independent Disks (RAID) set; the "I" originally stood for "Inexpensive". There are several numbers
(levels) associated with RAID, but the most common are 1, 5, and 10.
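RAID 1 works by duplicating (mirroring) every write on two hard drives. Suppose you have two
20-gigabyte drives. In RAID 1, data is written to both drives at the same time, so the set holds
20 gigabytes of usable data and survives the loss of either drive. Because no parity has to be
calculated, RAID 1 writes are fast compared with RAID 5.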
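As a rough illustration of mirroring, the sketch below models two drives as Python lists. The class
name MirroredPair and its methods are hypothetical, chosen only for this example. A real
controller works on disk blocks and also handles rebuilding a replaced drive, which is omitted here.

```python
class MirroredPair:
    """Toy model of a two-drive RAID 1 set (hypothetical, for illustration only)."""

    def __init__(self, num_blocks: int):
        self.drives = [[None] * num_blocks, [None] * num_blocks]
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # Every write goes to each drive that is still working.
        for index, drive in enumerate(self.drives):
            if not self.failed[index]:
                drive[block] = data

    def read(self, block: int) -> bytes:
        # Either copy can satisfy a read, so use the first healthy drive.
        for index, drive in enumerate(self.drives):
            if not self.failed[index]:
                return drive[block]
        raise IOError("both drives in the mirror have failed")

    def fail(self, index: int) -> None:
        """Simulate the loss of one drive."""
        self.failed[index] = True


if __name__ == "__main__":
    mirror = MirroredPair(num_blocks=4)
    mirror.write(0, b"payroll")
    mirror.fail(0)
    print(mirror.read(0))  # still readable from the second drive
```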
RAID 5 works by writing parts of data across all drives in the set (it requires at least three drives).
Without redundancy, the failure of a single drive would make the entire set worthless. To combat
this problem, RAID 5 stores "parity" information alongside the data, rotating the parity blocks
across all the drives rather than keeping them on one dedicated drive. Think of a math problem, such
as 3 + 7 = 10. You can think of two drives as storing the numbers 3 and 7, and the 10 is the parity
part. If any one of the three values is lost, you can get it back by referring to the other two,
like this: 3 + X = 10. Losing more than one drive at the same time, however, makes the data
unrecoverable. RAID 5 is optimized for reads.
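The addition analogy can be made concrete. Actual RAID 5 parity is computed with bitwise XOR rather
than addition, but the recovery idea is the same: combine whatever survives to get back the missing
piece. The function names parity and rebuild below are made up for this sketch, and the blocks are
assumed to be of equal length.

```python
from functools import reduce


def parity(blocks: list) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks: list) -> bytes:
    """Recover the one missing block: XOR of all surviving blocks, parity included."""
    return parity(surviving_blocks)


if __name__ == "__main__":
    # The arithmetic analogy from the text: 3 + X = 10, so X = 10 - 3.
    print(10 - 3)

    # The same idea with XOR parity over two data blocks.
    d0 = b"\x03\x10"
    d1 = b"\x07\x20"
    p = parity([d0, d1])
    print(rebuild([d0, p]) == d1)  # the lost block d1 is recovered from d0 and the parity
```

XOR is convenient because it is its own inverse, so the same operation both creates the parity and
rebuilds a lost block.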
RAID 10 combines RAID 1 mirroring with plain striping (RAID 0): data is striped across pairs of
mirrored drives. It does not store parity, so writes are faster than on RAID 5, but every block is
duplicated on two drives to be safe. You need at least four drives for RAID 10. This type of RAID is
probably the best compromise for a database server.
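One way to compare the three levels is by how much usable space they leave. The helper below is a
back-of-the-envelope sketch, not a vendor formula: the function name usable_capacity is
hypothetical, and it assumes equal-size drives, a two-drive mirror for RAID 1, one drive's worth of
parity for RAID 5, and two-way mirrors for RAID 10.

```python
def usable_capacity(level: int, num_drives: int, drive_size_gb: int) -> int:
    """Return the usable capacity in gigabytes for RAID 1, 5, or 10 (simplified)."""
    if level == 1:
        if num_drives != 2:
            raise ValueError("this sketch models RAID 1 as a two-drive mirror")
        return drive_size_gb
    if level == 5:
        if num_drives < 3:
            raise ValueError("RAID 5 requires at least three drives")
        # One drive's worth of space is consumed by parity.
        return (num_drives - 1) * drive_size_gb
    if level == 10:
        if num_drives < 4 or num_drives % 2 != 0:
            raise ValueError("RAID 10 requires an even number of drives, at least four")
        # Half of the drives hold mirror copies.
        return (num_drives // 2) * drive_size_gb
    raise ValueError("unsupported RAID level")


if __name__ == "__main__":
    print(usable_capacity(1, 2, 20))
    print(usable_capacity(5, 4, 20))
    print(usable_capacity(10, 4, 20))
```

With four 20-gigabyte drives, RAID 5 leaves 60 gigabytes usable and RAID 10 leaves 40, which is the
space cost of RAID 10's extra copy.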