Post where I noticed it: https://hexbear.net/post/3239878
Normal lemmy-ui hexbear:
The lite version of lemmy at next.hexbear.net apparently already proxies thumbnails:
Unfortunately I'm not that familiar with the Lemmy codebase, but how feasible would it be to proxy and cache thumbnails?
As far as I know, there is no infrastructure for this at all. As an alternative, Hexbear prevents embedding remotely hosted images in comments and posts, but as you have pointed out, this does not seem to be a 100% foolproof measure. There are other Fediverse platforms which do cache all remote media, but this comes at a cost. Matapacos.dog, an instance of about 100 active users, is currently storing 2.3 TB of cached media. (There's a much smaller 48-hour hot cache on the VPS; the 2.3 TB is the 1-month (IIRC) secondary cache, which lives in an S3-compatible object storage bucket.)
I'm beginning to think something is preventing this cache from being cleaned periodically though lmao.
mastodon@matapacos:~/live$ bin/tootctl media remove --dry-run --days 30
2625433/2625777 |============================================== | ETA: ??:??:??
Removed 2625766 media attachments (approx. 1.7 TB) (DRY RUN)
:yea:
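For a rough sense of scale, the dry-run numbers above can be turned into an average attachment size and a post-cleanup estimate. This is only back-of-the-envelope arithmetic on the figures quoted in this thread (2.3 TB stored, about 1.7 TB removable across 2,625,766 attachments). It assumes decimal terabytes; if the tool reports binary units, the byte counts would be roughly 10% higher.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the thread.
# Assumption: "TB" means 10**12 bytes (decimal), which may not match the tool.
TB = 10**12

attachments = 2_625_766          # "Removed 2625766 media attachments"
removable_bytes = 1.7 * TB       # "approx. 1.7 TB"
total_bytes = 2.3 * TB           # "currently storing 2.3 TB"

# Average size of one removable attachment, in kilobytes.
avg_kb = removable_bytes / attachments / 1000

# What would be left in the bucket if the cleanup actually ran.
remaining_tb = (total_bytes - removable_bytes) / TB

# Fraction of the bucket that is older than the 30-day cutoff.
removable_share = removable_bytes / total_bytes

print(f"average attachment: ~{avg_kb:.0f} kB")
print(f"left after cleanup: ~{remaining_tb:.1f} TB")
print(f"share removable:    ~{removable_share:.0%}")
```

So most of the bucket is past the cutoff, which fits the suspicion that the periodic cleanup is not running.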
What's your object store? I have a couple projects that would be easier with something S3-compatible, but I would love to not give Jeffrey Kisses any money for the privilege.
Interesting stats! Yeah, it sounds like storing everything remote that ever touched hexbear is going to balloon storage costs. I guess if there's no infrastructure for it, then it just has to wait until the Lemmy devs offer something. Maybe thumbnails are less important and could just be proxied without long-term storage, and if the source image dies, well, it's just a thumbnail.
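To make the "proxy without long-term storage" idea concrete, here is a minimal sketch of a short-lived thumbnail cache. It is not Lemmy code (Lemmy is written in Rust) and nothing here reflects how next.hexbear.net does it. `ThumbnailCache`, `fetch_over_http`, and the 48-hour default (borrowed from the hot cache mentioned earlier in the thread) are all hypothetical choices for illustration.

```python
import time
import urllib.request
from typing import Callable, Dict, Optional, Tuple


def fetch_over_http(url: str) -> bytes:
    """Hypothetical default fetcher: download the remote image bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


class ThumbnailCache:
    """Fetch a remote thumbnail once, serve it from memory until it expires."""

    def __init__(
        self,
        fetch: Callable[[str], bytes] = fetch_over_http,
        ttl_seconds: float = 48 * 3600,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # url -> (time stored, image bytes)
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def get(self, url: str) -> Optional[bytes]:
        """Return the thumbnail bytes, or None if the source is gone."""
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: the remote host sees no request
        try:
            data = self._fetch(url)
        except Exception:
            # Source image died: forget it, since it's just a thumbnail.
            self._entries.pop(url, None)
            return None
        self._entries[url] = (now, data)
        return data

    def purge(self) -> int:
        """Drop expired entries so the cache cannot grow without bound."""
        now = self._clock()
        expired = [u for u, (t, _) in self._entries.items() if now - t >= self._ttl]
        for url in expired:
            del self._entries[url]
        return len(expired)
```

With something like this in front of thumbnails, the remote host only ever sees the instance's requests, not each reader's, and storage stays bounded by whatever was viewed within the TTL window.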
IMO the cost is immaterial. It is a design decision. Less caching makes the start-up cost of instances lower. More caching makes the instances safer to use, but increases the requirements of running one. Either choice could be correct depending on the context.
> Yeah, it sounds like storing everything remote that ever touched hexbear is going to balloon storage costs.
'Balloon' in this case is still manageable. In my case, this storage still costs less than $20 a month (the minimum is about $8/mo for 1 TB or less). If you're paying Amazon prices you're fucked though.
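Those two figures happen to be consistent with a flat rate of about $8 per TB per month with a 1 TB minimum. That pricing model is a guess fitted to the numbers in this thread, not the provider's published rate, and `monthly_cost_usd` is a hypothetical helper. A quick sketch:

```python
def monthly_cost_usd(
    stored_tb: float,
    rate_per_tb: float = 8.0,   # assumed: ~$8 per TB per month
    minimum_tb: float = 1.0,    # assumed: billed as at least 1 TB
) -> float:
    """Estimate a monthly bill under an assumed flat per-TB rate with a minimum."""
    billable_tb = max(stored_tb, minimum_tb)
    return billable_tb * rate_per_tb


# 2.3 TB of cached media lands under the "$20 a month" mentioned above.
print(f"2.3 TB: ${monthly_cost_usd(2.3):.2f}/mo")

# Anything at or below 1 TB is billed at the minimum.
print(f"0.6 TB: ${monthly_cost_usd(0.6):.2f}/mo")
```

Under these assumptions, even an uncleaned month-long cache for a roughly 100-user instance stays in hobby-budget territory.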