this post was submitted on 14 May 2024
PC Gaming
TL;DR: Bigger drives reduce the risk of data loss over time. Please back up your data. RAID is not a backup.
As drives get bigger and bigger, the emotional risk you feel when you fill them up is real. However, that is not the best way to think about it. Drives will inevitably fail. They are easily replaced commodities, so their failure should be expected and handled appropriately. RAID is not a backup, and it does not reduce your risk of drive failure. RAID creates a safer environment for your data when a drive fails. Think of RAID as replacing a failed drive in advance, not as reducing the risk of the drive failing.
To illustrate my point, say we have Y amount of data to store. I can either split the data across X drives or store it all on a single drive. Which is safer? A single drive is objectively safer, given the same failure rate. Consider two cases. In both, this imaginary drive fails 10% of the time. The exact figure doesn't matter so long as the drives' failure rates are reasonably close.
Case A: You have 1 drive holding all your data. There is a 1/10 chance it fails. Your risk is 10%.
Case B: You have X drives holding all your data. Each drive has a 1/10 chance of failing, so there is a 1−(9/10)^X chance that at least one of the drives fails. For every X greater than 1, your rate of failure is higher than 1/10. With two drives you have a 19% chance of failure; with three drives, about 27%.
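The arithmetic in Case B can be checked with a few lines of Python. This is only a sketch: it assumes the drives fail independently of each other and that losing any one drive loses data (no redundancy). The function name `any_drive_failure_probability` is made up for this example.

```python
def any_drive_failure_probability(per_drive_rate: float, drive_count: int) -> float:
    """Chance that at least one of `drive_count` independent drives fails."""
    # All drives survive with probability (1 - rate)^count;
    # the complement is the chance that at least one fails.
    return 1 - (1 - per_drive_rate) ** drive_count


# The 10% figure is the imaginary failure rate used in the two cases above.
for drives in (1, 2, 3, 4):
    risk = any_drive_failure_probability(0.10, drives)
    print(f"{drives} drive(s): {risk:.1%}")
```

With the 10% rate this prints 10.0%, 19.0%, 27.1%, and 34.4% for one to four drives, matching the figures above.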
In all cases your rate of failure increases the more drives you add to hold your data. Please do not be confused about what RAID does in this illustration. RAID will not prevent drive failures. RAID allows you to, in essence, "pre-fail" a drive in advance. A drive will fail, and some RAID configurations (1, 5, 6) will replace the functionality of the failed drive until you can replace the "real" failed drive. RAID did not prevent your drive failure; it only moved the time you have to deal with the failure to one that is convenient for you. A RAID1 array with a failed drive still has a failed drive that needs to be replaced, and the array still needs to be restored from backup or rebuilt.
Let's take the cases of no RAID vs RAID1.
Case A: You have 1 drive holding all your data. When the drive fails, you stop your work and replace the drive immediately.
Case RAID1: You have 1 drive holding all your data. When the drive fails, you continue working because you've been very busy. You replace the drive when you have some downtime a week later.
In Case A, you lost productivity because the drive failed at an inconvenient time. In the RAID1 case, you could schedule the drive replacement for a later date when you had some spare time, which is a huge improvement in the user experience. But wait! I said that in the RAID1 case only one of the drives was holding my data. Should I have said two drives were? Yes, in a literal sense the RAID1 array holds a copy of the data on the second drive. However, RAID is not a backup; it is a system for scheduling when you deal with drive failures. Your backup of the RAID array is what holds a real second copy of your data, not your mirrored drive, because RAID is not a backup. The second drive was still present in Case A; it was just bought and installed after the failure occurred rather than before.
Be safe with your data. Please make backups, and regularly verify that you can restore from them. RAID is not a backup.
You're assuming that the failure rates for all drives are the same, though. Aren't the failure rates for new high-capacity drives typically higher?
Yes, their failure rates are usually a bit higher, but the increase is usually smaller than the increase in risk from using more than one disk instead. A bit of math using Backblaze's disk failure rate data gives a reasonable approximation of the overall risk of failure.
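That "bit of math" could look something like the sketch below. The rates in it are made-up placeholders, not Backblaze figures; swap in the published rates for the actual drive models you are comparing. It assumes independent failures and data split across the smaller drives with no redundancy, and the function names are hypothetical.

```python
def break_even_rate(small_drive_rate: float, drive_count: int) -> float:
    """Failure rate a single big drive must stay below to beat
    `drive_count` smaller drives that each fail at `small_drive_rate`."""
    # Same formula as before: chance that at least one small drive fails.
    return 1 - (1 - small_drive_rate) ** drive_count


def single_drive_is_safer(big_drive_rate: float,
                          small_drive_rate: float,
                          drive_count: int) -> bool:
    """True when one big drive is less likely to fail than any of the small ones."""
    return big_drive_rate < break_even_rate(small_drive_rate, drive_count)


# Placeholder rates for illustration only (NOT real Backblaze data):
# four small drives at 1.0% each versus one big drive at 1.5%.
small_rate, big_rate, count = 0.010, 0.015, 4
print(f"break-even rate: {break_even_rate(small_rate, count):.2%}")
print("single big drive safer:", single_drive_is_safer(big_rate, small_rate, count))
```

With these placeholder numbers the big drive could fail at nearly four times the rate of a small one and still come out ahead, which is the point being made above.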