Digital Trash Heap (lemmy.world)

So here’s the problem I have: I have several generations of backups, which currently take up a huge amount of space on my NAS. I want to go through and process all of the files on it while deduplicating, and possibly tag any files I find that are helpful. Is anyone aware of a good tool to help accomplish this? Again, because of the nature of the backups, I don’t want to use any software that I’m not running locally.

Thanks in advance.

top 7 comments
[-] paradox2011@lemmy.ml 7 points 8 months ago

How are your backups currently stored? Are they simple copies of the files, like you would make with rsync? I assume you're on a Linux NAS, in which case fdupes would likely fit the bill. meld would be another option, and it also has a GUI if your NAS isn't headless.
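For plain file copies, the core idea behind duplicate finders can be sketched in a few lines of Python. This is only a rough illustration of the general technique (group by size, then by content hash), not how fdupes itself is implemented; the function names `file_digest` and `find_duplicates` are made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def find_duplicates(root):
    """Return groups of paths under root whose contents hash identically."""
    # Pass 1: group regular files by size (cheap, no file reads needed).
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Calling `find_duplicates("/path/to/backups")` (a placeholder path) only reports groups; it deletes nothing, so you can review the list before deciding what to remove.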

For future backups, restic might be a nice option, as it deduplicates data each time you run a backup. You can set retention policies (e.g. 7 daily, 4 weekly, 2 monthly) so that it keeps backups only at regular intervals and drops the rest.
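To show what a "7 daily, 4 weekly, 2 monthly" policy means in practice, here is a small Python sketch of the selection logic. It is an assumption-laden illustration, not restic's actual algorithm: `select_keep` is a hypothetical helper that keeps the newest snapshot in each of the most recent N days, ISO weeks, and months.

```python
from datetime import datetime


def select_keep(snapshots, daily=7, weekly=4, monthly=2):
    """Return the sorted snapshot timestamps a simple retention policy keeps.

    For each rule, walk snapshots newest-first and keep the newest one in
    each of the most recent `count` distinct buckets (day, ISO week, month).
    """
    rules = [
        (daily, lambda t: t.date()),
        (weekly, lambda t: tuple(t.isocalendar())[:2]),  # (ISO year, ISO week)
        (monthly, lambda t: (t.year, t.month)),
    ]
    ordered = sorted(snapshots, reverse=True)
    keep = set()
    for count, bucket_of in rules:
        seen = []
        for t in ordered:
            bucket = bucket_of(t)
            if bucket in seen:
                continue  # a newer snapshot already represents this bucket
            if len(seen) == count:
                break  # this rule has used up all of its buckets
            seen.append(bucket)
            keep.add(t)
    return sorted(keep)
```

Everything not returned by `select_keep` would be a candidate for removal. A snapshot kept by one rule (say, daily) can also satisfy another (weekly), which is why the total kept is usually smaller than the sum of the three counts.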

[-] witten@lemmy.world 1 points 8 months ago

Borg Backup would also fit the bill for backups going forward, especially if OP is still backing up to a local server (as opposed to cloud object storage).

[-] paradox2011@lemmy.ml 1 points 8 months ago

I haven't tried Borg, but have noticed it mentioned pretty often in data hoarder forums. What do you like about it?

[-] witten@lemmy.world 3 points 8 months ago

It deduplicates aggressively at the block level. So if your files don't change much, each additional backup takes very little space. And if a file changes a little, Borg only backs up what's changed instead of the whole file again.
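A toy Python sketch of the idea: split data into chunks, store each unique chunk once under its hash, and record a list of hashes per backup. Note the assumptions: Borg uses content-defined (variable-size) chunking plus compression and encryption, whereas this example uses fixed-size chunks purely to keep it short, and `ChunkStore` is a made-up class, not part of Borg.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept only once."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # tiny on purpose, for demonstration
        self.chunks = {}  # digest -> chunk bytes

    def backup(self, data):
        """Store data and return the list of chunk digests that rebuilds it."""
        manifest = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            manifest.append(digest)
        return manifest

    def restore(self, manifest):
        """Reassemble the original bytes from a manifest."""
        return b"".join(self.chunks[d] for d in manifest)

    def stored_bytes(self):
        """Total size of unique chunks actually held in the store."""
        return sum(len(c) for c in self.chunks.values())
```

Backing up `b"aaaabbbbcccc"` and then `b"aaaaXXXXcccc"` into the same store adds only the one changed chunk for the second backup, while both versions remain fully restorable from their manifests.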

Borg also has a rich ecosystem of wrappers and tools (borgmatic, Vorta, etc.) that extend its functionality and make it easier to use.

[-] paradox2011@lemmy.ml 2 points 8 months ago

Interesting, sounds like it's worth checking out. Plus, as a Star Trek fan, I approve of the name 😄

[-] Rootiest@lemmy.world 1 points 8 months ago

I like Kopia. It has a similar feature set to Borg, but I prefer its UI.

[-] SheeEttin@lemmy.world 5 points 8 months ago

The only thing I can think of is to do a restore of all the backups in sequence, assuming they're all of the same thing. That would give you one consolidated image. Then you could run some deduplication and take a new single backup, if desired.

But really it's so subjective that I don't think there's really any way to automate it. I would mount all the backups, go through everything, pick out what you want to keep, and delete the rest.

Look at it this way. If you've had the backup for years and never needed to restore any of those files, how likely are you to ever need them in the future? Even if you did delete something you later wanted, how life-threatening would it be to not have it?

Or you could take the easy way out and just add more storage.

this post was submitted on 25 Oct 2023
17 points (90.5% liked)
