RAID servers: the ultimate defence against data loss?
Data storage on RAID servers is proving popular with businesses, thanks to fault tolerance, continuous availability or data reconstruction after an incident, and business continuity. Could RAID systems be the ultimate bulwark against data loss?
RAID architectures: reliability and performance
RAID stands for Redundant Array of Independent Disks. Here is how it works: a RAID system is a storage unit made up of several hard disks grouped into an array (or cluster), across which data is distributed.
There are several RAID levels, which combine hard disks and distribute data in different ways depending on the goal. The most common levels include:
- RAID 0 or striping: data is interleaved (striped) across the disks and distributed over different volumes to improve read and write speeds. However, the absence of data redundancy means that there is no fault tolerance.
- RAID 1 or mirroring: data is replicated (mirrored) in real time on all the hard disks in the array. This type of redundancy ensures continuous data availability in the event of an incident.
- RAID 5 or block-interleaved distributed parity: each hard disk holds part of the data together with parity blocks, which are redundancy information that enables lost data to be reconstructed if one of the hard disks fails.
- RAID 10 or RAID 1+0: combines RAID 0 interleaving (data distributed across a cluster) and RAID 1 duplication (each cluster replicated on an equivalent cluster). It offers reliability, fault tolerance and read/write performance together.
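To make the idea of a parity block more concrete, here is a minimal Python sketch of how XOR parity, the technique commonly used for RAID 5, allows one missing block to be recomputed from the others. The function names (`xor_blocks`, `make_stripe`, `rebuild_missing`) and the single-stripe layout are assumptions made for illustration; a real controller also rotates the parity block across disks and works at a much lower level.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (hypothetical layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild_missing(stripe, missing_index):
    """Recompute the block at missing_index from all the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)


# Three data blocks, each imagined on its own disk, plus one parity block.
stripe = make_stripe([b"ABCD", b"EFGH", b"IJKL"])

# Simulate the loss of the second disk and rebuild its block.
recovered = rebuild_missing(stripe, 1)
print(recovered == b"EFGH")
```

The same XOR operation rebuilds any single block, data or parity, which is why this scheme tolerates the loss of one disk but not two.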
RAID architectures are based on the principle that several storage units are unlikely to fail simultaneously. Their purpose is to improve read and write performance (RAID 0, RAID 5) and/or fault tolerance with high data availability (RAID with redundancy).
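As a rough illustration of that principle, the sketch below computes the probability that at least a given number of disks fail within the same period. It assumes that every disk fails independently with the same probability, and the 2% figure is a purely hypothetical value chosen for the example, not a measured failure rate.

```python
from math import comb


def prob_at_least_k_failures(n_disks, p_fail, k):
    """Binomial probability that at least k of n_disks fail.

    Assumes independent failures with identical probability p_fail.
    """
    return sum(
        comb(n_disks, i) * p_fail**i * (1 - p_fail) ** (n_disks - i)
        for i in range(k, n_disks + 1)
    )


# Hypothetical array: 4 disks, each with a 2% chance of failing in the period.
one_or_more = prob_at_least_k_failures(4, 0.02, 1)
two_or_more = prob_at_least_k_failures(4, 0.02, 2)
print(f"at least one failure:  {one_or_more:.4%}")
print(f"at least two failures: {two_or_more:.4%}")
```

Under these assumptions, a double failure is far less likely than a single one. The independence assumption is optimistic, though, so the result is best read as an order of magnitude rather than a guarantee.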
RAID levels with redundancy keep working after the loss of one of the hard disks, so the system remains accessible in the event of an incident. Whether data is continuously available or needs to be rebuilt, business can continue uninterrupted.
RAID system failures, or why you should anticipate the aftermath
Although highly reliable, RAID systems are not infallible. Their disks are vulnerable to the same external causes and types of failure as any other hard disk. RAID systems are also exposed to failure scenarios of their own:
- Failure of several hard disks: the hard disks in a system often come from the same production series, so the probability of several of them failing at the same time is very real.
- RAID controller failure preventing access to drives, resulting in loss of data access.
- RAID controller malfunctions leading to RAID configuration alterations or errors.
- Errors or corruption during a data reconstruction phase.
Relying entirely on the reliability and performance of a RAID system is therefore insufficient. Under the provisions of the GDPR (General Data Protection Regulation), companies are legally obliged to deploy the necessary means to ensure data security and, in the event of an incident, to restore the integrity and availability of the data.
Storage on a RAID system with redundancy provides excellent data protection in itself. However, every company needs to anticipate the incident or disaster scenarios that could lead to data loss. Risk analysis, impact analysis and the definition of preventive and corrective measures will enable it to define the most appropriate BCP (Business Continuity Plan) and BRP (Business Resumption Plan).
29 May 2018