this post was submitted on 05 Jun 2024
Selfhosted

I'm planning on building a new home server and was thinking about the possibility of using disk spanning to create matching disk sizes for a RAID array. I have 2x2TB drives and 4x4TB drives.

Comparison with RAID 5

4 x 4 TB drives

  • 1 RAID array
  • 12 TB total

4 x 4 TB drives & 2 x 2 TB drives

  • 2 RAID arrays
  • 14 TB total

5 x 4* TB drives

  • Four 4 TB disks, plus the two 2 TB disks spanned to produce a fifth 4 TB block device
  • 16 TB total
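
The three totals above can be checked with a small sketch. It assumes the usual single-parity rule (every member counts as the smallest member, and one member's worth of space goes to parity), treats the 2 x 2 TB pair in the second layout as a two-member array (which works out the same as a mirror), and ignores filesystem overhead. The function and variable names are made up for illustration.

```python
def parity_array_usable_tb(member_sizes_tb, parity_members=1):
    """Usable capacity of one parity array, in TB.

    Every member is treated as being as large as the smallest member,
    and `parity_members` members' worth of space holds parity.
    """
    if len(member_sizes_tb) <= parity_members:
        raise ValueError("need more members than parity members")
    data_members = len(member_sizes_tb) - parity_members
    return data_members * min(member_sizes_tb)


def layout_usable_tb(arrays):
    """Total usable capacity of a layout made of independent arrays."""
    return sum(parity_array_usable_tb(members) for members in arrays)


# Hypothetical name: the two 2 TB drives spanned into one 4 TB device.
SPANNED_4TB = 2 + 2

LAYOUTS = {
    "4 x 4 TB": [[4, 4, 4, 4]],
    "4 x 4 TB & 2 x 2 TB": [[4, 4, 4, 4], [2, 2]],
    "5 x 4* TB": [[4, 4, 4, 4, SPANNED_4TB]],
}

if __name__ == "__main__":
    for name, arrays in LAYOUTS.items():
        print(f"{name}: {layout_usable_tb(arrays)} TB usable")
```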

I'm not actually planning on doing this, because this setup will probably have all kinds of problems. However, I do wonder: what would those problems be?

[–] [email protected] 4 points 5 months ago

Synology has its own version of RAID 5 (Synology Hybrid RAID, SHR) that can handle your specific disk configuration without any modification:

https://www.synology.com/en-us/support/RAID_calculator?drives=4%20TB%7C4%20TB%7C4%20TB%7C4%20TB%7C2%20TB%7C2%20TB&raid=SHR_1%7CRAID_5

Not sure if similar things are available on other platforms.

[–] [email protected] 3 points 5 months ago (1 children)

What are you going to be running on these disks? I haven't used ZFS; maybe it supports mismatched sizes? Or maybe you could do one array with the 4s, another with the 2s, and use LVM to pool them together? Or just keep them separate and fill them up independently.

[–] [email protected] 4 points 5 months ago

ZFS doesn't really support mismatched disks. In OP's case a single RAID-Z1 vdev would behave as if it was 6x 2TB disks, making 8 TB of raw storage unusable, with 1 disk of parity that would yield 10 TB of usable storage. In the future the 2x 2TB disks could be swapped with 4 TB disks, and then ZFS would make use of all the storage, yielding 20 TB of usable storage.

BTRFS handles mismatched disks just fine, however its RAID5 and RAID6 modes are still partially broken. RAID1 works fine, but results in half the storage being used for the mirror copies, so this would again yield a total of 10 TB usable with the current disks.
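
A rough way to check capacity figures like these for OP's six disks (4x 4TB + 2x 2TB) is sketched below. It assumes one RAID-Z vdev containing every disk, where each disk counts as the smallest one, and BTRFS RAID1 keeping exactly two copies of everything. Metadata overhead and TB vs TiB are ignored, and all function names are hypothetical.

```python
def raidz_usable_tb(disk_sizes_tb, parity_disks=1):
    """Usable TB of a single RAID-Z vdev: each disk counts as the smallest."""
    if len(disk_sizes_tb) <= parity_disks:
        raise ValueError("need more disks than parity disks")
    return (len(disk_sizes_tb) - parity_disks) * min(disk_sizes_tb)


def raidz_unusable_raw_tb(disk_sizes_tb):
    """Raw TB left unused because larger disks are truncated to the smallest."""
    smallest = min(disk_sizes_tb)
    return sum(size - smallest for size in disk_sizes_tb)


def btrfs_raid1_usable_tb(disk_sizes_tb):
    """Usable TB with two copies of every block on two different disks.

    Limited either by half the raw total, or by how much the other disks
    can mirror when one disk is bigger than all the rest combined.
    """
    total = sum(disk_sizes_tb)
    largest = max(disk_sizes_tb)
    return min(total / 2, total - largest)


OP_DISKS_TB = [4, 4, 4, 4, 2, 2]
UPGRADED_DISKS_TB = [4, 4, 4, 4, 4, 4]  # after swapping the 2 TB disks

if __name__ == "__main__":
    print("RAID-Z1 usable now:", raidz_usable_tb(OP_DISKS_TB), "TB")
    print("RAID-Z1 raw unusable now:", raidz_unusable_raw_tb(OP_DISKS_TB), "TB")
    print("RAID-Z1 usable after upgrade:", raidz_usable_tb(UPGRADED_DISKS_TB), "TB")
    print("BTRFS RAID1 usable now:", btrfs_raid1_usable_tb(OP_DISKS_TB), "TB")
```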

[–] [email protected] 3 points 5 months ago

I ran RAID-Z2 across 4x14TB and a (4+8)TB LVM LV for close to a year before finally swapping the (4+8)TB LV for a 5th 14TB drive via zpool replace without issue. I did, however, make sure to use RAID-Z2 rather than Z1 to account for said shenanigans out of an abundance of caution, and I would highly recommend doing the same. That is to say, the extra 2x2TB would be good additional parity, but I would only consider it as additional parity, not the only parity.

Based on fairly unscientific testing from before and after, it did not appear to meaningfully affect performance.

[–] [email protected] 3 points 5 months ago

Typical problems with parity arrays are:

  • They suffer from something called the "write hole". If power fails while information is being written to the array, different drives can end up with conflicting versions of the information and no way to reconcile them. The software solution is to use ZFS, but ZFS has a pretty steep learning curve and is not easy to manage. The hardware solution is to make sure power to the array never fails, by using either a UPS for the machine or connecting the drives through a controller card with a battery, which keeps pending writes safe so they can be completed even if power is lost.
  • Making up a 4 TB device out of 2x2 TB drives is not a good idea: you're roughly doubling the failure probability of that particular "4 TB" drive, because losing either 2 TB drive kills the whole span.
  • Parity arrays usually require drives to all be the same size. That means if you want to upgrade your array, you need to replace every drive before you can take advantage of the increased space. There are parity schemes like Unraid that work around this by using only one large parity drive that computes parities across all the others regardless of their sizes; but Unraid is proprietary and requires a paid license.
  • If a drive fails, rebuilding the array after replacing that drive requires an intensive pass through all the surviving members of the array. This can greatly increase the risk of another drive failing. A RAID5 array would be lost if that occurred. That's why people usually recommend RAID6, but RAID6 only makes sense with 5+ drives.
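
To put a rough number on the spanned-drive point in the list above: a span is lost if any one of its members fails. The sketch below assumes the two drives fail independently and uses a made-up 2% yearly failure rate purely for illustration; real drives, especially same-age ones, won't be that tidy.

```python
def span_failure_probability(member_failure_probs):
    """Probability that a spanned device is lost, assuming independent members.

    The span survives only if every member survives, so the failure
    probability is 1 minus the product of the survival probabilities.
    """
    survive_all = 1.0
    for p in member_failure_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be between 0 and 1")
        survive_all *= 1.0 - p
    return 1.0 - survive_all


# Hypothetical yearly failure rate for one drive, not a measured figure.
SINGLE_DRIVE_P = 0.02

if __name__ == "__main__":
    single = span_failure_probability([SINGLE_DRIVE_P])
    spanned = span_failure_probability([SINGLE_DRIVE_P, SINGLE_DRIVE_P])
    print(f"one real 4 TB drive: {single:.4f}")
    print(f"2 x 2 TB span:       {spanned:.4f}")
    print(f"ratio:               {spanned / single:.2f}x")
```

For small failure rates the ratio comes out just under 2, which is where "basically doubling" comes from.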

Unrelated to parity:

  • Using a lot of small drives is very power-intensive and inefficient.
  • Whenever designing arrays you have to consider what you'll do in case of drive failure. Do you have a replacement on hand? Will you go out and buy another drive? How long will it take for it to reach you?
  • What about backups?
  • How much of your data is really essential and should be preserved at all costs?
[–] [email protected] 3 points 5 months ago* (last edited 5 months ago)

If you don't need realtime parity, I've had no issues on my media server running mismatched drives pooled via MergerFS with SnapRAID doing scheduled parity.
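
For a sense of what that approach would give with OP's disks, here is a minimal capacity sketch. It assumes the SnapRAID rule that each parity disk has to be at least as large as the largest data disk (so the biggest disks are used for parity), and that MergerFS simply pools whatever data disks remain. The function name is hypothetical and filesystem overhead is ignored.

```python
def snapraid_usable_tb(disk_sizes_tb, parity_disks=1):
    """Pooled data capacity when the largest disks are dedicated to parity.

    Data disks keep their own filesystems at full size, so mismatched
    sizes cost nothing; only the parity disks are given up.
    """
    if not 0 < parity_disks < len(disk_sizes_tb):
        raise ValueError("need at least one parity disk and one data disk")
    ordered = sorted(disk_sizes_tb, reverse=True)
    data_disks = ordered[parity_disks:]
    return sum(data_disks)


OP_DISKS_TB = [4, 4, 4, 4, 2, 2]

if __name__ == "__main__":
    for parity in (1, 2):
        usable = snapraid_usable_tb(OP_DISKS_TB, parity_disks=parity)
        print(f"{parity} parity disk(s): {usable} TB of pooled data")
```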

[–] [email protected] 2 points 5 months ago* (last edited 5 months ago)

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

Fewer Letters   More Letters
LVM             (Linux) Logical Volume Manager for filesystem mapping
RAID            Redundant Array of Independent Disks for mass storage
ZFS             Solaris/Linux filesystem focusing on data integrity

3 acronyms in this thread; the most compressed thread commented on today has 17 acronyms.

[Thread #788 for this sub, first seen 5th Jun 2024, 22:35]

[–] [email protected] 0 points 5 months ago

If you're doing weird shit, partition the 4TBs into 2x2TB, now you have 10x2TB. Or use Unraid / MergerFS + SnapRAID.
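
One thing worth checking before going the partition route is how many array members disappear when a single physical disk dies. The sketch below is only an illustration: the device names are hypothetical, and it assumes all ten 2 TB pieces go into one parity array.

```python
from collections import Counter


def worst_case_members_lost(member_to_disk):
    """Most array members that share one physical disk."""
    return max(Counter(member_to_disk.values()).values())


def survives_single_disk_failure(member_to_disk, parity_members=1):
    """True if losing any one physical disk stays within the parity budget."""
    return worst_case_members_lost(member_to_disk) <= parity_members


# Hypothetical layout: each 4 TB disk split into two 2 TB partitions,
# plus the two real 2 TB disks used whole.
MEMBERS = {
    "sda1": "sda", "sda2": "sda",
    "sdb1": "sdb", "sdb2": "sdb",
    "sdc1": "sdc", "sdc2": "sdc",
    "sdd1": "sdd", "sdd2": "sdd",
    "sde": "sde",
    "sdf": "sdf",
}

if __name__ == "__main__":
    print("members:", len(MEMBERS))
    print("lost when one 4 TB disk dies:", worst_case_members_lost(MEMBERS))
    print("single parity survives:", survives_single_disk_failure(MEMBERS, 1))
    print("double parity survives:", survives_single_disk_failure(MEMBERS, 2))
```

With this layout a dead 4 TB disk takes two members with it, so single parity wouldn't cover it; double parity would, but only for that one disk.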

[–] [email protected] 0 points 5 months ago

I don't see why it would cause any problems, apart from the stacked mapping layers. I wonder whether LVM can do it natively, without adding an intermediate block device made of the 2 x 2 TB drives?