It's A Digital Disease!


This is a sub that aims to bring data hoarders together to share their passion with like-minded people.

founded 1 year ago
51
 
 
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/datahoarder by /u/Soft-Ad4690 on 2024-10-28 16:03:45+00:00.


I've been looking into BD-Rs for long-term data backup, but I'm finding a lot of mixed information on whether current Blu-ray discs are HTL or LTH. There seem to be conflicting claims, and I want to clear things up:

  • This post claims HTL BD-Rs are almost nonexistent now (with the exception of M-Discs): link to post.
  • On the other hand, in another discussion, some users say that LTH discs generally don’t exceed 25GB capacity and are rare even at that size: link to post.

I'm looking to purchase BD-Rs to back up my data, but I'm not sure if they’re HTL or LTH, and there’s no mention of the type on the product page.

Does anyone have recent info on the availability of HTL/LTH BD-Rs, or tips on how to identify which type I'm buying?

52
 
 
The original was posted on /r/datahoarder by /u/jpablomsan on 2024-10-28 04:36:56+00:00.


I'm reading only horror stories so far from people that had to wait 10+ hours (often having issues midway) when turning off BL on HDDs of just 1 or 2 TB.

Does anyone here by chance have any experience doing this on a 10+TB HDD? I have a 20TB external drive arriving in the next few weeks, so waiting for it and offloading the 10TB HDD onto it is also an option, but not really my first choice.

EDIT: I forgot to ask: if I reduce the data down to 50% of capacity, do you think it's going to take less time? Or won't it make any difference?

53
 
 
The original was posted on /r/datahoarder by /u/maxi1134 on 2024-10-27 18:31:25+00:00.


I have a large media collection and a hearing problem, which led to an issue where I would not understand everything in the media I consume.

Well, it seems like Bazarr is there to save me!

I have been using it for a little over 48 hours and it generated 1150 subtitles in the meantime.

Having tried Spanish, English, and French shows, I can say that they are about 90-95% accurate, which beats no subs at all for someone like me who has hearing issues.

Complete info here!

Whisper could also be piped to generate subs for family video footage.

An example of the delay between generations:

54
 
 
The original was posted on /r/datahoarder by /u/Jozac0522 on 2024-10-28 02:09:08+00:00.

55
 
 
The original was posted on /r/datahoarder by /u/GreenTeaBD on 2024-10-27 08:56:26+00:00.


Hi everyone, I wanted to bounce an idea off people and see how/if this could work.

I think we're starting to get close to the point where storage is cheap enough for individuals to archive copies of the old Internet as archived on the Wayback Machine. Not soon-soon, but 5 to 10 years maybe? At least if we chop it up into a few chunks. I've been seeing those stories around here about people expecting HDD capacity to really make some jumps soon, so who knows?

The Wayback Machine is huge, 99+PB. You look at their data and in 2023 they had 735 billion pages archived. Obviously there's no practical way for everyone to have this, but you look at earlier years and the number is a lot smaller. In 2003 they had only 11 billion pages archived. This number jumps to 30 billion in 2004. That 2003/2004 point also seems like a good (though somewhat arbitrary) line to draw in the sand for "old internet" vs "new internet" (or at least "can be mirrored by a normal person maybe sometime soon" internet and "can't" internet). I might be wrong here, but 2003/2004 feels like about the time everyone started getting broadband and the Internet changed drastically.

That's not the whole picture either: pre-broadband websites were much smaller. Low-res images and a whole lot less JavaScript and other stuff kept the sites small, maybe 50KB to 100KB a page. They had to be; anything more was brutal over dialup. The Internet itself was a lot smaller, too.

So, if we take 2003's 11 billion pages and assume 100KB a page (a dangerous assumption, but it's all the data I have to work with; this is a rough estimate), we can estimate that the total Wayback Machine archive for the old Internet is 1.1PB.
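The arithmetic behind that estimate can be sketched in a few lines. This is only a rough illustration: the page counts come from the post, the per-page sizes are the post's own guesses, the function name is made up for the example, and decimal units (1 KB = 1000 bytes, 1 PB = 1000^5 bytes) are an assumption.

```python
# Rough size estimate for the pre-2004 Wayback Machine archive.
# Page count is from the post; page sizes are the post's guesses.
# Decimal units are an assumption: 1 KB = 1000 B, 1 PB = 1000**5 B.

PAGES_ARCHIVED_BY_2003 = 11_000_000_000
BYTES_PER_KB = 1000
BYTES_PER_PB = 1000 ** 5


def archive_size_pb(pages: int, kb_per_page: int) -> float:
    """Return the estimated archive size in petabytes."""
    total_bytes = pages * kb_per_page * BYTES_PER_KB
    return total_bytes / BYTES_PER_PB


if __name__ == "__main__":
    # Show both ends of the 50KB to 100KB per-page guess.
    for kb in (50, 100):
        size = archive_size_pb(PAGES_ARCHIVED_BY_2003, kb)
        print(f"{kb} KB/page -> {size:.2f} PB")
```

The result scales linearly with the per-page guess, so halving the assumed page size halves the estimate.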

So, what do I want to do here? 1.1PB is still a lot, I'm at 120TB right now... But that feels reachable soon enough. I worry about the Internet Archive dying sometime, maybe not soon but in the future. Who knows what could happen. The old Internet is important to me, it's our digital heritage. It needs to be kept safe.

Does anyone think it would be possible to make this a shareable archive, in the future, so that the old internet can be downloaded as one big chunk, shared among everyone who feels like having it, and therefore be more safely preserved?

I think obviously it can, but the big problem is, would archive.org go along with this? I doubt they would be happy with me as just some guy blasting the whole archive and scraping everything from 96 to 2003, but if this is a coordinated project with the goal of further preservation in mind, would they go along with it? I've seen some people associated with IA post around here, so if they have any input I'd be interested in it, or if they could correct my estimates.

Would people even be interested in this? I am, but I'm an incredibly weird guy so who knows. I'm not thinking of this as a project to start now but we'll see where storage technology goes in the coming years.

I gotta admit, I also thought of this whole thing because I use theoldnet's proxy in my emulated 98se P100 install and thought it would be cool as hell to have a local mirror that's insanely fast, or just to poke through for hours or make more searchable.

56
 
 
The original was posted on /r/datahoarder by /u/imemine9876 on 2024-10-27 19:42:28+00:00.

57
 
 
The original was posted on /r/datahoarder by /u/Electronic_Gur_183 on 2024-10-27 04:13:34+00:00.


enthusiast of online preservation, i recently stumbled upon this subreddit researching the IA hack and i've been hooked. i don't personally do any hoarding or archival myself but i am a true appreciator of it. it's interesting to see where the old software, games and magazines i used to download off the IA come from. and during my many trips to my local thrift stores, whenever something looks insanely obscure, niche, or generally weird and not something most people would care about, i always jokingly say to my brother "there is no way this is ANYWHERE on the internet." and i've always wondered if that statement were true. because i too think those things are generally weird, and don't care about them. so, i pose a question to ye data hoarders: is there anything you don't have uploaded to any publicly accessible archival site, or anything you have that you're pretty sure is not anywhere on the internet? and do you upload all of it? some of it? just the things you can't find anywhere on the internet? very curious to hear. and thank you all for what you do. i'd be fresh out of luck trying to gauge the average price of old computers by combing through catalog scans without the work of people like you, or potentially even you yourself!

edit: if there is anything in your collection you know for sure is unavailable online, do you plan on uploading it?

58
 
 
The original was posted on /r/datahoarder by /u/madcatzplayer5 on 2024-10-26 20:35:57+00:00.


Just thought it was interesting to think of each file in $ terms. A 700MB DivX AVI file, alternatively, costs a penny to store.

59
 
 
The original was posted on /r/datahoarder by /u/Nillows on 2024-10-26 19:25:24+00:00.

60
 
 
The original was posted on /r/datahoarder by /u/Rodnys_Danger666 on 2024-10-26 17:27:15+00:00.

61
 
 
The original was posted on /r/datahoarder by /u/AshleyUncia on 2024-10-26 14:30:32+00:00.

62
 
 
The original was posted on /r/datahoarder by /u/Fearless-Team-2644 on 2024-10-26 05:32:34+00:00.

63
 
 
The original was posted on /r/datahoarder by /u/iVXsz on 2024-10-25 03:14:28+00:00.


It has happened... for a while now, a lot of older videos have had their VP9 streams removed and only have AVC streams. I randomly discovered this while watching some older videos and wondering why the quality was extra bad. I went back to my archive, and guess what? The video looked a lot better, and then I found out VP9 got neutered on all older videos.

An approximate date is July 20th, going by a report from a user on YT-DLP's Discord a day after it happened, yet it went under the radar and no one seems to have talked about this (afaik).

The issue is that the AVC streams are mostly garbage compared to the VP9 streams: it's so bad even though both are about the same bitrate. I wish I had known about this sooner. Out of all things I really didn't expect this from YouTube; it seems pretty weird. I get that videos like these don't get much traffic, but the channel has millions of subs and people watch his older videos regularly, especially since he isn't as active nowadays.

1080p60 is affected as well, only av1 and avc remain. 1440p is not affected... yet.
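One way to check an archive or a fresh metadata dump for this is to group the available video codecs by resolution. The sketch below is only an illustration: it assumes format entries shaped like the dictionaries in the `formats` list of a yt-dlp info dict (with `vcodec` and `height` keys), the codec string prefixes are an assumption about how those values usually look, and the function names and sample data are made up.

```python
from collections import defaultdict

# Assumed codec-string prefixes (e.g. "vp09.00.40.08", "avc1.640028",
# "av01.0.08M.08"); adjust if your metadata uses other spellings.
CODEC_PREFIXES = {
    "vp9": ("vp9", "vp09"),
    "avc": ("avc1", "h264"),
    "av1": ("av01",),
}


def codec_family(vcodec):
    """Map a raw codec string to 'vp9', 'avc', 'av1', or None."""
    value = (vcodec or "none").lower()
    for family, prefixes in CODEC_PREFIXES.items():
        if value.startswith(prefixes):
            return family
    return None  # audio-only entries ("none") or unknown codecs


def codecs_by_height(formats):
    """Return {height: set of codec families} for video formats."""
    result = defaultdict(set)
    for fmt in formats:
        family = codec_family(fmt.get("vcodec"))
        height = fmt.get("height")
        if family and height:
            result[height].add(family)
    return dict(result)


if __name__ == "__main__":
    # Hypothetical sample resembling an affected older video.
    sample = [
        {"vcodec": "avc1.640028", "height": 1080},
        {"vcodec": "av01.0.08M.08", "height": 1080},
        {"vcodec": "vp09.00.50.08", "height": 1440},
        {"vcodec": "none", "height": None},
    ]
    for height, families in sorted(codecs_by_height(sample).items()):
        print(height, sorted(families))
```

A resolution whose set lacks `vp9` would match the behaviour described in the post.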

64
 
 
The original was posted on /r/datahoarder by /u/lightnb11 on 2024-10-24 19:05:34+00:00.


Western Digital is refusing to honor the warranty on my Ultrastar hard drive. They claim it was stolen. I provided them with a receipt showing that I am the original purchaser. I bought these drives on Newegg.

Has anyone else dealt with this?

This is the email I got from Western Digital after asking them why I could not register my drive's serial number online.

Dear [my name],

Thank you for providing your Proof of Purchase (POP).
Unfortunately, your product is not eligible for warranty update based on this receipt. 
We do honor warranty update from the purchase date, however, every proof of purchase submitted is subject for review and verified within our own records.

According to our records, serial number QGKAR3XT (MDL 0F31156) shows reported as lost or stolen and the warranty is void.
There is no warranty associated with the product.

Please note, pursuant to Western Digital’s Limited Warranty Policy (), Western Digital has no liability with respect to products that are not sold to you as new.

However, even if you bought the drive from Amazon, it was sold by a third party seller (PlatinumMicro) where there is no assurance how they acquired the drive that was sold to you. This is unfortunate risk buying from online third party sellers.
The warranty date remains and continues to be the current warranty period if there is any, even if the customer purchases one of these drives from a retailer and/or private party,

We recommend you contact your place of purchase for compensation or warranty support. You may use this email letter to demonstrate Western Digital’s official view of the product you purchased from your vendor.

We apologize for the inconvenience this issue may have caused you.
If you have any further questions, please reply to this email.

Sincerely,

Western Digital Service and Support

65
 
 
The original was posted on /r/datahoarder by /u/poynnnnn on 2024-10-24 15:07:46+00:00.


Right now, I am sharing a folder on my main PC. I have two other PCs with a lot of VMs accessing this folder, but the problem is that in Windows 11, only 20 users can access the shared folder. The folder contains my Visual Studio Code, which I use for machine learning and to store data in a database. Will a NAS solve this issue? I've been looking for a solution and would love to hear some advice. Can I still run my code with NAS storage? I'm not sure how a NAS works, but I'm doing my research at the moment.

66
 
 
The original was posted on /r/datahoarder by /u/GoodJohn14 on 2024-10-24 01:46:15+00:00.


My family was impacted by Hurricane Helene, and was left without internet or cell service for weeks. During this time there were many times where my family wanted to find general information but couldn't.

To be better prepared, I'm creating a few Raspberry Pi powered Kiwix devices to function as hotspots that anyone can connect to and get offline Wikipedia and other useful information.

As I'd like these devices to last years in a closet (until needed), potentially unpowered, I need to know what storage medium is best to run the OS on. MicroSD, External HD, External SSD, etc. Any advice?

TLDR: Best storage for long term unpowered lifespan?

67
 
 
The original was posted on /r/datahoarder by /u/GinWhiskers on 2024-10-24 00:57:45+00:00.

68
 
 
The original was posted on /r/datahoarder by /u/wiener_dawg on 2024-10-22 23:36:23+00:00.

69
 
 
The original was posted on /r/datahoarder by /u/The_Bukkake_Ninja on 2024-10-22 21:17:06+00:00.


I have a large volume of backed up documents, photos and pst files that I have consolidated off random discs and portable HDDs over the years onto an Unraid server.

There are a lot of duplicate files on those drives. I’ve run a Czkawka analysis of the files using a BLAKE3 hash comparison. I followed this guide () on how to configure the comparison, i.e. checking a pre-hash of the first 2KB of each file to eliminate files that clearly aren’t duplicates, then doing a full hash comparison of the remainder.
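The two-stage comparison described above can be sketched roughly as follows. This is not Czkawka's implementation, only an illustration of the idea. BLAKE3 is not in Python's standard library, so `hashlib.blake2b` is swapped in here as a stand-in, and the function names and the 2048-byte constant are choices made for the example.

```python
import hashlib
from collections import defaultdict

PREHASH_BYTES = 2048   # "first 2KB" from the post, taken as 2048 bytes
CHUNK_SIZE = 1 << 20   # read full files 1 MiB at a time


def file_digest(path, limit=None):
    """Hash the first `limit` bytes of a file, or all of it if None.

    Uses BLAKE2b as a stand-in for BLAKE3 (an assumption for this sketch).
    """
    hasher = hashlib.blake2b(digest_size=32)
    with open(path, "rb") as handle:
        if limit is not None:
            hasher.update(handle.read(limit))
        else:
            for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
                hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(paths):
    """Return sorted groups of paths whose full-file hashes match."""
    # Stage 1: cheap pre-hash of the first 2KB rules out most files.
    by_prehash = defaultdict(list)
    for path in paths:
        by_prehash[file_digest(path, PREHASH_BYTES)].append(path)

    # Stage 2: full hash only for files that share a pre-hash.
    groups = []
    for candidates in by_prehash.values():
        if len(candidates) < 2:
            continue
        by_full = defaultdict(list)
        for path in candidates:
            by_full[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_full.values() if len(g) > 1)
    return sorted(groups)
```

Before deleting anything, a byte-for-byte comparison of each flagged group (for example with `filecmp.cmp(a, b, shallow=False)`) is a cheap extra check.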

My (probably dumb) question is: is there any chance that a file flagged as a duplicate based on a full BLAKE3 hash comparison is, in fact, not a duplicate? My assumption is that this is basically mathematically impossible, but I wanted to check with people possessing greater expertise before I went and eliminated all but one copy.
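A back-of-the-envelope way to put a number on "basically mathematically impossible" is the birthday bound. The sketch below assumes the hash behaves like a uniformly random function with BLAKE3's default 256-bit output, and it only covers accidental collisions, not deliberately crafted ones; the function name is made up for the example.

```python
from fractions import Fraction

HASH_BITS = 256  # BLAKE3's default output length


def collision_probability(num_files, bits=HASH_BITS):
    """Upper bound on the chance that any two distinct files collide.

    Union bound over all pairs: pairs / 2**bits. Assumes the hash
    acts like a uniformly random function (idealized model).
    """
    pairs = num_files * (num_files - 1) // 2
    return float(Fraction(pairs, 2 ** bits))


if __name__ == "__main__":
    for count in (10**6, 10**9, 10**12):
        print(f"{count:>15,} files -> p <= {collision_probability(count):.3e}")
```

Under that model, even a collection of a billion distinct files gives a bound many orders of magnitude below the chance of a disk silently corrupting data.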

Apologies if this has been fully answered in another thread - I’ve searched this subreddit, but with how bad Reddit search is I could have missed it.

70
 
 
The original was posted on /r/datahoarder by /u/a_shootin_star on 2024-10-22 17:58:19+00:00.

71
 
 
The original was posted on /r/datahoarder by /u/Devil-Eater24 on 2024-10-22 03:24:33+00:00.

72
 
 
The original was posted on /r/datahoarder by /u/ANameForThisShite on 2024-10-21 18:24:03+00:00.

73
 
 
The original was posted on /r/datahoarder by /u/elgato123 on 2024-10-21 04:38:07+00:00.


Looking to find cloud storage for permanent backup and archiving that would only be accessed in the event of a complete disaster. I don’t really care what the restore cost would be, because in the event that we have such a big data loss disaster, insurance would probably kick in and pay that cost. Just looking for the cheapest monthly storage. As far as I can tell, AWS S3 Glacier Deep Archive seems to be the cheapest.

74
 
 
The original was posted on /r/datahoarder by /u/TranscendentalLove on 2024-10-21 04:40:43+00:00.


...this music was ONLY on the internet archive. It wasn't on Spotify/Apple/Tidal/Deezer/Qobuz/Amazon; It wasn't on private torrenting trackers like OiNK/What/Waffles/RED/OPS; it wasn't on Usenet/Soulseek/public torrenting; it wasn't even on YouTube/Facebook/Instagram/TikTok; it wasn't available in stores; it sometimes wasn't even CATALOGUED on MusicBrainz/Discogs/Wikipedia.

I'm talking about hand-ripped 78s that were ripped in like 10 different ways, with audiological knowledge then used to determine which rip was best for the end user.

I actually HAVE some of these, but I am finding that I didn't write down any metadata and there is NO information on the years, artist, context, b-sides, label, etc ANYWHERE, let alone a copy.

I'm well-aware of the breadth and depth of rare music. I'm aware of obscure demos; 60s and 70s Vinyl-only pressings that were never remastered or re-released on CD; I'm aware of limited run stuff...

...NONE of that compares to music from the 1910s-1930s and how much of it was archived on the internet archive. I'm talking B-Sides and everything. EVEN THEN, they wouldn't have everything, but they had so much.

I'm a young man -- this music isn't my forte -- it became an acquired taste, like all music, I now understand. So I am very intrigued and interested and love compiling and even listening to it, but I'm not in a position to truly be motivated to archive all this music like it deserves. Yet even with my proximity to it, it sometimes feels like I'm the only one who even knows it exists.

Some of these songs are the original recordings of songs everyone knows today as standards; ballads. Some of these songs led to entire genres being formed. Some of these songs feature now-extinct sensibilities and lyrics that are just truly a delight to experience.

I miss the internet archive and I want it back. I have a slew of music I would like to cross-reference; I have many more songs and b-sides from the top (now Billboard, then something else) charts of the 20s-40s I want to explore.

It's hard not to feel like this is symbolic of where we are as a world. It feels a bit eerie knowing this is happening, as if society is decaying in real time around us. I hope it's back online soon.

75
 
 
The original was posted on /r/datahoarder by /u/lilmatarte on 2024-10-20 23:01:12+00:00.
