this post was submitted on 01 Sep 2023
ZFS backup strategy (lemmy.sdfeu.org)
submitted 11 months ago* (last edited 11 months ago) by biscuits@lemmy.sdfeu.org to c/datahoarder@lemmy.ml
 

Hello,

Lately I've been thinking about my backup strategy as I finish building my NAS. I want to use ZFS, and my idea is to put two drives in a mirror (RAID-1) configuration and take periodic snapshots of that dataset. I want to do the same thing in a second location, so in the end my files would be on 4 different drives in 2 different locations, protected by snapshots from deletion or any other unwanted modification.
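For reference, the mirror-plus-snapshots setup described above can be sketched as a pair of commands. This is only a rough illustration: the pool name `tank`, the dataset `tank/data`, the device paths, and the `auto-` snapshot label are all made-up placeholders, and the script only builds and prints the commands rather than running them.

```python
import shlex
from datetime import datetime, timezone


def create_mirror_cmd(pool, disk_a, disk_b):
    """Return the argv for creating a two-disk mirrored pool."""
    return ["zpool", "create", pool, "mirror", disk_a, disk_b]


def snapshot_cmd(dataset, when):
    """Return the argv for a recursive snapshot labelled with a timestamp."""
    # The "auto-" prefix and timestamp layout are arbitrary choices for this sketch.
    label = when.strftime("auto-%Y-%m-%d_%H%M")
    return ["zfs", "snapshot", "-r", f"{dataset}@{label}"]


if __name__ == "__main__":
    # Hypothetical names; substitute your own pool, dataset, and disks.
    print(shlex.join(create_mirror_cmd("tank", "/dev/sda", "/dev/sdb")))
    print(shlex.join(snapshot_cmd("tank/data", datetime.now(timezone.utc))))
```

Running the snapshot command from cron or a systemd timer would give the periodic part; the tools mentioned further down the thread handle that scheduling for you.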

With this setup, would it be possible to just swap one of the drives in one location, have ZFS automatically rebuild the data on the new drive, and then take the removed drive to the second location and do the same there, so that all drives end up exactly the same without copying data manually? I believe all of the drives would need to be exactly the same size, is that right?

Is it a good idea in general, or should I ditch it? Or should I just ditch the ZFS rebuilding part and use some kind of software for that instead?

Thank you for your help in advance!

[–] PriorProject@lemmy.world 1 points 11 months ago* (last edited 11 months ago) (3 children)

I don't know if what you're suggesting is possible. As I read it, the idea is to split your "live" raid-1 in half, then use one drive to rebuild the "live" pool and the other drive to rebuild the "backups" pool. It might work, but I can't think of any advantage to that approach and it's not something I would have thought to attempt.

I'd do one of:

  • Ship the data over the network using ZFS send or something like syncoid/sanoid (which use ZFS send under the hood). It might be slow, but is that an issue? Waiting a week for the initial sync might be fine.
  • But syncing by sneakernet is a good strategy too, and can be faster if your backup site is close or your connectivity is slow. In this case, I'd build the backup pool at the live site... ideally in an external drive bay... but one could plug it in internally as well. Then sync them with a local ZFS send, export the backup pool, detach and transport the backup pool to the backup site, then reattach the backup pool at the backup site and import it. Et voilà, the backup pool is running at the remote site fully populated with data and subsequent ZFS sends will be incremental.
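The two options above can be sketched roughly as follows. Treat this as an illustration under assumptions, not a tested procedure: the pool names `tank` and `backup`, the dataset and snapshot names, and the host `backup-host` are hypothetical, and the script only prints the shell commands instead of executing them.

```python
import shlex


def pipeline(*cmds):
    """Join several argv lists into one shell pipeline string."""
    return " | ".join(shlex.join(cmd) for cmd in cmds)


def full_send(src_snap, dst_dataset):
    """Initial replication: send a snapshot and everything beneath it."""
    # -R builds a replication stream (descendant datasets and their snapshots).
    return pipeline(["zfs", "send", "-R", src_snap],
                    ["zfs", "receive", dst_dataset])


def incremental_send(prev_snap, new_snap, dst_dataset, ssh_host=None):
    """Later syncs: send only the changes between two snapshots."""
    receive = ["zfs", "receive", dst_dataset]
    if ssh_host is not None:
        # Over the network, the receiving side runs on the remote host.
        receive = ["ssh", ssh_host] + receive
    return pipeline(["zfs", "send", "-R", "-i", prev_snap, new_snap], receive)


def sneakernet_steps(src_snap, backup_pool, dst_dataset):
    """Seed the backup pool locally, then move it to the backup site."""
    return [
        full_send(src_snap, dst_dataset),            # at the live site
        shlex.join(["zpool", "export", backup_pool]),  # before unplugging
        shlex.join(["zpool", "import", backup_pool]),  # at the backup site
    ]


if __name__ == "__main__":
    for step in sneakernet_steps("tank/data@seed", "backup", "backup/data"):
        print(step)
    print(incremental_send("tank/data@seed", "tank/data@week2",
                           "backup/data", ssh_host="backup-host"))
```

One thing to keep in mind: an incremental receive expects the destination dataset to be unchanged since the previous snapshot, so the backup copy should be treated as read-only between syncs.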

Splitting and rebuilding your live pool might be possible, but I can imagine a lot that might go wrong and I can't see any reason to do it that way over export/import.

[–] biscuits@lemmy.sdfeu.org 1 points 11 months ago* (last edited 11 months ago) (1 children)

Thanks, I guess that's an even better solution, and it doesn't involve the kinda risky step of removing drives from the pool. But do you think my strategy of using snapshots as backup is good overall, or should I use something else?

[–] PriorProject@lemmy.world 2 points 11 months ago (1 children)

Yeah, sending snapshots to a separate and often remote pool is an extremely common backup strategy for folks who have long-term settled on ZFS. There's very nice tooling for this that presents a more traditional schedule/retention based interface to save you scripting snapshots and sends directly.

  • Sanoid is an old standby in that space.
  • Zrepl is getting a lot of traction lately and seems to be an up-and-coming option.
  • I use pyznap, but I don't recommend it to others, as the maintainer is on a multi-year hiatus, which makes it undermaintained. It works great, but it isn't getting active development, which makes it a poor bet in a crowded space with many great options. I plan to eval Zrepl when I get around to it.
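
To give a feel for what a "schedule/retention based interface" decides on your behalf, here is a minimal sketch of bucket-based retention. It is not the algorithm used by Sanoid, Zrepl, or pyznap; the function name and the `hourly`/`daily` parameters are invented for this illustration.

```python
from datetime import datetime


def snapshots_to_keep(timestamps, hourly=24, daily=7):
    """Pick which snapshot timestamps survive a simple retention policy.

    Keeps the newest snapshot in each of the most recent `hourly` hour
    buckets and each of the most recent `daily` day buckets.
    """
    keep = set()
    for bucket_format, count in (("%Y-%m-%d %H", hourly), ("%Y-%m-%d", daily)):
        seen = []
        # Walk newest first, so the first hit in a bucket is its newest snapshot.
        for ts in sorted(timestamps, reverse=True):
            bucket = ts.strftime(bucket_format)
            if bucket in seen:
                continue
            if len(seen) == count:
                break  # quota of buckets reached; everything older is dropped
            seen.append(bucket)
            keep.add(ts)
    return keep


if __name__ == "__main__":
    snaps = [
        datetime(2023, 9, 1, 11, 0),
        datetime(2023, 9, 1, 10, 30),
        datetime(2023, 9, 1, 10, 0),
        datetime(2023, 8, 31, 9, 0),
        datetime(2023, 8, 30, 9, 0),
    ]
    kept = snapshots_to_keep(snaps, hourly=2, daily=2)
    for ts in sorted(set(snaps) - kept):
        print("would prune", ts.isoformat())
```

With `hourly=2, daily=2` on the sample data, the 10:00 snapshot from 1 September and the 30 August snapshot are the ones marked for pruning. The real tools layer scheduling, monthly/yearly tiers, and the actual `zfs destroy` calls on top of this kind of decision.
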
[–] Nogami@lemmy.world 1 points 11 months ago

Sanoid works great. Very easy setup and no issues.
