this post was submitted on 10 Jul 2023

datahoarder


I have between 20 and 30 TB of data I want to keep a copy of in a firesafe. I do not want to use an online storage solution; I want to maintain my personal data at my home.

My current plan is to get (2) Mediasonic HFR2-SU3S2 PRORAID enclosures and (8) WD Red Pro NAS 16TB drives to fill them. The first would contain a full backup and be placed in the safe. The second would be attached to my machine and receive nightly backups. Periodically, I would rotate the enclosures, taking the one from the safe and swapping it with the one connected to my machine.
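
A quick capacity check for this plan, as a minimal sketch. It assumes the eight drives are split four per enclosure (the post does not say so explicitly), and the layout names are illustrative labels rather than the enclosure's actual mode names. Figures are nominal TB, so real formatted capacity will be somewhat lower, and incremental history needs headroom on top of the data itself.

```python
# Nominal usable capacity of ONE enclosure under a few common layouts.
# Assumption: 4 drives per enclosure (8 drives split across 2 enclosures).
DRIVE_TB = 16
DRIVES_PER_ENCLOSURE = 4


def usable_tb(layout: str, drives: int = DRIVES_PER_ENCLOSURE, size_tb: int = DRIVE_TB) -> int:
    """Return nominal usable capacity in TB for the given layout."""
    if layout == "stripe":   # striping/spanning: all space usable, no redundancy
        return drives * size_tb
    if layout == "raid5":    # one drive's worth of space goes to parity
        return (drives - 1) * size_tb
    if layout == "raid10":   # mirrored pairs: half the raw space
        return (drives // 2) * size_tb
    raise ValueError(f"unknown layout: {layout}")


def fits(data_tb: float, layout: str) -> bool:
    """True if data_tb fits in one enclosure's nominal capacity."""
    return data_tb <= usable_tb(layout)


if __name__ == "__main__":
    for layout in ("stripe", "raid5", "raid10"):
        print(layout, usable_tb(layout), "TB nominal; 30 TB fits:", fits(30, layout))
```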

Are there any problems with my plan that I am not thinking of? Are there better solutions?

Is anyone else keeping a rotating data backup in a safe? How is it working out for you?

[email protected] 3 points 1 year ago* (last edited 1 year ago) (4 children)

Sounds fine?

Yes: Treat the two enclosures independently and symmetrically, such that you can fully restore from either one (the only difference would be that the one in the safe is slightly stale) and the ongoing upkeep is just:

  1. Think: "Oh, it's been a while since I did a swap" (or use a calendar or something)
  2. Unplug the drive at the computer.
  3. Carry it to the safe.
  4. Open the safe.
  5. Take the drive in the safe out.
  6. Put the other drive in the safe.
  7. Close the safe.
  8. Carry the other drive to the computer.
  9. Plug it in.
  10. (Maybe: authenticate for the drive encryption if you use normal full-disk encryption & don't cache the credential)

If I assume a normal incremental backup setup, both enclosures would have a full backup and a pile of incremental backups. For example, if swapped every three days:

Enclosure A        Enclosure B
-----------------  ---------------
a-full-2023-07-01
a-incr-2023-07-02
a-incr-2023-07-03
                   b-full-2023-07-04
                   b-incr-2023-07-05
                   b-incr-2023-07-06
a-incr-2023-07-07
a-incr-2023-07-08
a-incr-2023-07-09
                   b-incr-2023-07-10
                   b-incr-2023-07-11
                   b-incr-2023-07-12
a-incr-2023-07-13
....

The thing taking the backups need not even detect or care which enclosure is plugged in -- it just uses the last incremental on that enclosure to determine what's changed & needs to be included in the next incremental.
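
That decision can be sketched in a few lines of Python. This is only an illustration: `plan_next_archive` is a hypothetical helper, the archive names follow the table above, and `label` is purely cosmetic (the logic only looks at what is already on the plugged-in enclosure).

```python
from datetime import date
from typing import Optional, Tuple


def plan_next_archive(existing: list[str], label: str, today: date) -> Tuple[str, Optional[str]]:
    """Decide what to write next, given the archive names on the mounted enclosure.

    Returns (new_archive_name, reference_archive_name). The reference is None
    when the enclosure is empty, which means a full backup is needed.
    """
    stamp = today.isoformat()
    if not existing:
        return f"{label}-full-{stamp}", None
    # Names end in an ISO date (YYYY-MM-DD), so the last 10 characters sort chronologically.
    latest = max(existing, key=lambda name: name[-10:])
    return f"{label}-incr-{stamp}", latest


if __name__ == "__main__":
    on_enclosure_a = ["a-full-2023-07-01", "a-incr-2023-07-02", "a-incr-2023-07-03"]
    print(plan_next_archive(on_enclosure_a, "a", date(2023, 7, 7)))
    print(plan_next_archive([], "b", date(2023, 7, 4)))
```

Because the only input is the listing of the current enclosure, a freshly wiped or brand-new enclosure falls into the "full backup" branch without any special handling.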

Nothing need care about the number or identity of enclosures: You could add a third if, for example, you found an offsite location you trust. Or when one of them eventually fails, you'd just start using a new one & everything would Just Work. Or, if you want to discard history (eg: to get back the storage space used by deleted files), you could just wipe one of them & let it automatically make a new full backup.

Are you asking for help with software? This could be as simple as dar and a shell script.
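
To give a feel for how small that script could be, here is a rough sketch in Python rather than shell. It only builds the command line and does not run anything. The paths are hypothetical, and the flags used (`-c` to create an archive, `-R` for the directory to back up, `-A` for the archive of reference) are my understanding of dar's options, so check them against `man dar` before relying on this.

```python
from typing import Optional


def dar_create_argv(archive_base: str, source_root: str,
                    reference_base: Optional[str] = None) -> list[str]:
    """Build an argument list for creating a dar archive.

    archive_base   -- archive basename on the enclosure (hypothetical path)
    source_root    -- directory tree to back up
    reference_base -- basename of the previous archive; None means a full backup
    """
    argv = ["dar", "-c", archive_base, "-R", source_root]
    if reference_base is not None:
        # With a reference archive, dar saves only what changed since then.
        argv += ["-A", reference_base]
    return argv


if __name__ == "__main__":
    # Hypothetical mount point and source directory.
    print(dar_create_argv("/mnt/backup/a-incr-2023-07-07", "/home",
                          "/mnt/backup/a-incr-2023-07-03"))
```

The resulting list could be handed to `subprocess.run(argv, check=True)` from a nightly job.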

My personal preference is to tell the enclosure to not try any fancy RAID stuff & just present all the drives directly to the host, and then let the host do the RAID stuff (with lvm or zfs or whatever), but I understand opinions differ. I like knowing I can always use any other enclosure or just plug the drives in directly if/when the enclosure dies.

I notice you didn't mention encryption, maybe because that's obvious these days? There's an interesting choice here, though: You can do normal full-disk encryption, or you could encrypt the archives individually.

Dar actually has an interesting feature here I haven't seen in any other backup tool: If you keep a small --aux file with the metadata needed for determining what will need to go in the next incremental, dar can encrypt the backup archives asymmetrically to a GPG key. This allows you to separate the capability of writing backups and the capability of reading backups. This is neat, but mostly unimportant because the backup is mostly just a copy of what's on the host. It comes into play only when accessing historical files that have been deleted on the host but are still recoverable from point-in-time restore from the incremental archives -- this becomes possible only with the private key, which is not used or needed by any of the backup automation, and so is not kept on the host.

(You could also, of course, do both full-disk encryption and per-archive encryption if you want the neat separate-credential for deleted files trick and also don't want to leak metadata about when backups happen and how large the incremental archives are / how much changed.)

(If you don't full-disk-encrypt the enclosure & rely only on the per-archive encryption, you'd want to keep the small --aux files on the host, not on the enclosure. The automation would need to keep one --aux file per enclosure, & for this narrow case, it would need to identify the enclosures to make sure it uses that enclosure's --aux file when generating the incremental archive.)
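
The per-enclosure bookkeeping for that last case might look roughly like this. Everything here is an assumption for illustration: the `<enclosure_id>-aux-<date>` naming scheme, the `plan_aux` helper, and the idea of identifying an enclosure by some stable token such as a filesystem label. How that token is obtained, and how the names are passed to dar, is left out.

```python
from datetime import date
from typing import Optional


def plan_aux(aux_names: list[str], enclosure_id: str, today: date) -> tuple[str, Optional[str]]:
    """Pick the aux files for this run from those kept on the host.

    aux_names    -- names of all aux files on the host, for every enclosure
    enclosure_id -- hypothetical stable identifier of the plugged-in enclosure
    Returns (new_aux_name, reference_aux_name). The reference is None when
    this enclosure has no aux file yet, i.e. it needs a full backup.
    """
    prefix = f"{enclosure_id}-aux-"
    # Only this enclosure's aux files count; ISO dates sort chronologically.
    mine = sorted(name for name in aux_names if name.startswith(prefix))
    reference = mine[-1] if mine else None
    return f"{prefix}{today.isoformat()}", reference


if __name__ == "__main__":
    on_host = ["encA-aux-2023-07-03", "encB-aux-2023-07-06", "encA-aux-2023-07-02"]
    print(plan_aux(on_host, "encA", date(2023, 7, 7)))
    print(plan_aux(on_host, "encC", date(2023, 7, 7)))
```

Mixing up enclosures here would produce an incremental computed against the wrong history, which is why this variant has to identify the enclosure while the plain setup does not.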

[email protected] 1 points 1 year ago

Thanks for pointing out dar, it definitely has some very interesting capabilities. My backup tool of choice is BorgBackup, as it makes local/remote, encrypted, deduplicated backups easy and allows mounting previous snapshots via FUSE for easy restore.
