this post was submitted on 05 Apr 2024
Asklemmy
I wholeheartedly agree with you. It's worth noting that many of the use cases for RAID can now be handled in software, but there are some places where hardware RAID still shines, such as redundancy. Yes, software can also provide redundancy, but I still haven't seen a software solution that is equivalent to a proper RAID controller with a dedicated battery to keep the I/O buffer alive in case of hardware failure. That one has saved me a few times.
Source: I'm in charge of 6 storage clusters at work. BeeGFS takes care of the actual clustering, and each cluster clocks in at 1.2 PB of storage. Each cluster consists of four machines with three storage volumes each.
Each storage volume consists of 12 drives in a RAID6 configuration.
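For anyone who wants to sanity-check that layout, here is a rough back-of-the-envelope sketch. It assumes that RAID6 spends two drives' worth of capacity per volume on parity and that the 1.2 PB figure is usable capacity in decimal units; neither is stated above, and the constant and function names are made up for illustration.

```python
# Rough capacity arithmetic for the cluster layout described above.
# Assumptions (not from the thread): RAID6 uses two drives' worth of
# capacity per volume for parity, and 1.2 PB is the usable capacity.
MACHINES_PER_CLUSTER = 4
VOLUMES_PER_MACHINE = 3
DRIVES_PER_VOLUME = 12
RAID6_PARITY_DRIVES = 2
CLUSTER_CAPACITY_TB = 1200  # 1.2 PB in decimal terabytes


def data_drives_per_cluster():
    """Number of drives' worth of capacity left for data in one cluster."""
    volumes = MACHINES_PER_CLUSTER * VOLUMES_PER_MACHINE
    return volumes * (DRIVES_PER_VOLUME - RAID6_PARITY_DRIVES)


def implied_drive_size_tb():
    """Per-drive size implied by the stated cluster capacity."""
    return CLUSTER_CAPACITY_TB / data_drives_per_cluster()


print(data_drives_per_cluster())  # 120
print(implied_drive_size_tb())    # 10.0
```

Under those assumptions the numbers line up with 10 TB drives, ignoring filesystem overhead and the spare machine.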
I can yank faulty drives and toss them out and have them replaced with no downtime. I know some like to set up hot spares, but I for one don't. I've even had entire servers die on me, and thanks to the additional redundancy provided by BeeGFS, I've swapped motherboards with no cluster downtime either. Just move the drives over to an identical machine (yes, each cluster has a dedicated spare machine), import the RAID, and you're good to go.
Unless I'm misunderstanding, that sounds like you're worried about the write hole, which RAIDZ doesn't have.
It's mostly a matter of making sure any writes that are interrupted partway through (power failure, etc.) are kept alive until the issue has been resolved. The RAID controller caches everything until the write is complete.
It's not so much about disks being out of sync, but more about preventing data loss.
RAIDZ is copy-on-write, and will notice and correct parity discrepancies if interrupted partway through. Doesn't help if you don't get at least one copy of the data written, but I'd take RAIDZ and a UPS over a hardware RAID any day.
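To make the write hole concrete, here is a toy sketch using single XOR parity (RAID5-style, for simplicity). It is only an illustration: the block contents and helper names are invented, and the copy-on-write part is a simplified model of the idea, not how ZFS or any controller is actually implemented.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(data, parity, lost):
    """Reconstruct the block at index `lost` from the survivors plus parity."""
    survivors = [block for i, block in enumerate(data) if i != lost]
    return xor_blocks(survivors + [parity])


# A consistent stripe: three data blocks plus their parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# In-place update interrupted by a power cut: the new data block reached
# the disk, but the matching parity update did not.
data[0] = b"ZZZZ"
parity_is_stale = xor_blocks(data) != parity

# Later, the disk holding block 1 dies and is rebuilt from stale parity.
rebuilt = rebuild(data, parity, lost=1)
print(parity_is_stale)     # True
print(rebuilt == b"BBBB")  # False: block 1 is silently corrupted

# Copy-on-write idea (toy model): build the complete new stripe elsewhere
# and only then switch the "live" pointer. A crash before the switch
# leaves the old, still-consistent stripe in place.
old_data = [b"AAAA", b"BBBB", b"CCCC"]
old_stripe = (old_data, xor_blocks(old_data))
new_data = [b"ZZZZ", b"BBBB", b"CCCC"]
new_stripe = (new_data, xor_blocks(new_data))
live = old_stripe  # simulated crash: the pointer was never switched
print(rebuild(live[0], live[1], lost=1) == b"BBBB")  # True
```

A battery-backed controller cache attacks the same problem from the other side: instead of never overwriting in place, it holds the in-flight write until it can be completed.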
And at the scale I'm operating, I'll take hardware RAID over RAIDZ any day. I did some performance benchmarking when initially building these clusters, and BeeGFS really doesn't like RAIDZ.
I use RAIDZ at home, though.
That's fair. My biggest concern with a hardware RAID is the risk of having trouble finding compatible hardware if/when a controller dies, but I expect that's not really an issue at larger scale; you probably buy hardware in bulk and have replacements on hand.