this post was submitted on 04 Jul 2023
22 points (100.0% liked)

No Stupid Questions


The tech giants make enough money that they could keep on growing forever, from my understanding.

But the fediverse? Sure the main instances that get enough funding are going to be okay, but what about the single-user instances 10 years from now on when there's a lot more content to download? Won't they go bankrupt just by trying to annex the big instances?

And I have the impression that the lemmy giants are going to change over time: does that mean that 50 years from now on, the posts I'm posting here today might get lost in time because the instances that annex it will have shut down by then?

I probably misunderstand how the fediverse works, but my worry is that the small instances won't be able to hold an ever-growing amount of data forever.

I spoke in absolutes for the sake of readability, but I'm as in-the-dark as can be.

top 23 comments
[–] wyzewyz@programming.dev 11 points 1 year ago (1 children)

I probably misunderstand how the fediverse works, but my worry is that the small instances won’t be able to hold an ever-growing amount of data forever.

Let's pretend you run a small Lemmy instance (~100 users).

If you federate with a large instance, you (i.e. your instance) will only receive new posts from communities that your users subscribe to, or users that your users follow [1]. These are deduplicated, in the sense that if all 100 of your users subscribe to the same community, you only need to download and store one copy of that community's posts in your database.

[1] AFAICT. The current implementation of Lemmy seems to handle federation using the activitypub_federation crate. I skimmed the docs of that crate, but they aren't 100% clear about this.
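
As a rough illustration of the deduplication described above, here is a minimal Python sketch. The names (`SmallInstance`, `subscribe`, `receive`) are made up for this example and are not Lemmy's or the activitypub_federation crate's actual API; it only models the idea that a post is stored once per instance, however many local users subscribe to its community.

```python
from collections import defaultdict


class SmallInstance:
    """Toy model of a small instance; hypothetical, not Lemmy's real data model."""

    def __init__(self):
        # community -> set of local users subscribed to it
        self.subscribers = defaultdict(set)
        # post_id -> post body; one shared copy per post
        self.posts = {}

    def subscribe(self, user, community):
        self.subscribers[community].add(user)

    def receive(self, community, post_id, body):
        """Store an incoming post only if some local user follows the community."""
        if not self.subscribers.get(community):
            return False  # no local subscribers, so nothing is stored
        # setdefault keeps the existing copy if the post was already delivered
        self.posts.setdefault(post_id, body)
        return True


instance = SmallInstance()
for n in range(100):
    instance.subscribe(f"user{n}", "nostupidquestions@lemmy.world")

instance.receive("nostupidquestions@lemmy.world", "post-1", "Can the fediverse scale?")
instance.receive("nostupidquestions@lemmy.world", "post-1", "Can the fediverse scale?")  # repeat delivery
instance.receive("beans@lemmy.world", "post-2", "beans")  # nobody here subscribes

print(len(instance.posts))  # 1
```

Storage in this model grows with the number of distinct posts in subscribed communities, not with the number of local users.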

the posts I’m posting here today might get lost in time because the instances that annex it will have shut down by then?

You have the same problem with any data you put online anywhere: The people currently keeping your stuff online might delete it anytime they decide it's not worth the trouble to keep it online.

If it's important to you that certain information stays online, keep a copy on a disk in your house; check back periodically to be sure it's still online, and if it's not, you can always use the copy in your house to put it online again somewhere else. If it's very important to you, keep multiple copies on multiple disks hosted by multiple companies on different continents.
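
A minimal sketch of the "check back periodically" step, in Python. The `fetch_online` callable is a placeholder (an assumption of this example) for however you retrieve the post, such as an HTTP GET; it should return None when the post is gone.

```python
import hashlib
from typing import Callable, Optional


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def needs_reupload(local_copy: bytes, fetch_online: Callable[[], Optional[bytes]]) -> bool:
    """Return True if the online copy is missing or no longer matches the local one."""
    online = fetch_online()
    return online is None or sha256(online) != sha256(local_copy)


local = b"my important post"
print(needs_reupload(local, lambda: b"my important post"))  # False
print(needs_reupload(local, lambda: None))  # True
```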

50 years from now on

Predicting what will happen in tech in 50 years is a pretty daunting challenge.

50 years ago, in 1973, all the computers on the ARPAnet (the predecessor of the Internet) could be easily listed on a single piece of paper. The home computer was still years from birth. The Intel 8080, Motorola 6800, MOS Technology 6502 and Zilog Z80, which would play key roles in early home computers and gaming consoles, were still one to three years away from reaching the market.

[–] Merulox@lemmy.world 1 points 1 year ago

All the answers I got were very useful and informative, but this one is definitely the one that catered the most to my worries.

[–] solidgrue@lemmy.world 9 points 1 year ago* (last edited 1 year ago) (1 children)

Mostly serious answer: the current implementation is not going to scale effectively with growth. The software implementation is still rough around the edges, and the ActivityPub protocol probably needs more knobs to handle bulk data synchronization. Within the service, moderation is a serious challenge with many unanswered questions.

Likewise, the back-end software implementation is monolithic, meaning it's one software stack that does everything from sign-in to subscriptions to synchronization and scheduling. Housekeeping and garbage collection probably aren't that tight, either. This is mostly speculation, based on watching the last couple of weeks' growth.

I believe the data store is based on the Postgres RDBMS, which, while robust and scalable, is fussy and needs tuning when turning over large amounts of highly unique data.

None of this is an indictment of the devs! Rather the opposite, because the software IS chugging along while experiencing tremendous growth.

I expect over time the back end will devolve into microservices that communicate over a highly scalable or stream-based messaging bus. Larger instances could probably also benefit from static caching and CDN techniques to keep pages loading quickly even while the back end thrashes.

The structure of the ecosystem needs to strike a balance between a few large instances and many, many small instances. In the first scenario, the scaling limit is in the monolithic stack, which introduces I/O bottlenecks and serialization delays (even if massively threaded). In the latter scenario, message state and synchronous distribution become challenging because a full mesh of federations could scale faster than network state tables have room to support. Some middle tier might be needed, and I have no idea what that might even look like.
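
To put a rough number on the full-mesh concern: if each of n instances federates directly with every other one, the number of pairwise relationships is n(n-1)/2, which grows quadratically. A quick Python sketch (the function name is mine, and real federation graphs are unlikely to be a complete mesh):

```python
def full_mesh_links(instances: int) -> int:
    """Number of distinct instance pairs in a full mesh of federations."""
    return instances * (instances - 1) // 2


for n in (10, 100, 1_000, 10_000):
    print(n, full_mesh_links(n))
# 10 45
# 100 4950
# 1000 499500
# 10000 49995000
```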

So to answer your question, can it scale indefinitely? Probably not, because we hit scaling limits pretty quickly on a number of dimensions. Nevertheless, smart people are starting to hang out here, and I expect they will take an interest in how it all works. Improvement is inevitable, and I think the early roadblocks will be overcome easily enough.

Edit to add: I'm a systems engineer in my day job but I work adjacent to the applications teams. The preceding commentary is just (un-)educated guesswork on my part.

[–] paholg@lemmy.one 2 points 1 year ago (1 children)

There's nothing wrong with a monolith. Microservices are not inherently more scalable. Their advantage is around scaling teams. If anything, a monolith can be more performant, as in-process calls are much faster than network calls.

[–] solidgrue@lemmy.world 0 points 1 year ago (1 children)

There can be better efficiencies by disaggregating the full stack into microservices and making IPC calls among scalable workers versus strictly service-per-server models which, yes, incur scaling issues from network iowait. Modern network operating systems do this, which allows heavier loaded processes more access to resources while lesser loaded processes are deferred.

[–] paholg@lemmy.one 2 points 1 year ago (1 children)

I'm not sure what you mean by a "network operating system", but monoliths are inherently just as scalable as services.

Imagine you have a service architecture, and you are running 2 of service A, 4 of service B, and 8 of service C.

Alternatively, you could be running a monolith on 14 nodes. Most of the work those 14 nodes do would have been covered by service C; it's just spread out in a different way.
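
Here is that comparison as a small Python sketch. It assumes, purely for illustration, that every node has the same capacity and that load is balanced evenly across the monolith's nodes; the variable names are made up.

```python
from fractions import Fraction

# Nodes per service, from the 2 / 4 / 8 split in the example above.
service_nodes = {"A": 2, "B": 4, "C": 8}

# The monolith needs the same total number of nodes.
total_nodes = sum(service_nodes.values())

# Every monolith node runs all the code, so each node's time is split
# across the three kinds of work in proportion to the load.
monolith_share = {name: Fraction(n, total_nodes) for name, n in service_nodes.items()}

print(total_nodes)  # 14
for name, share in monolith_share.items():
    print(name, share)
# A 1/7
# B 2/7
# C 4/7
```

Under these assumptions, each monolith node spends 4/7 of its time on what would have been service C's work, and total capacity is the same either way.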

[–] solidgrue@lemmy.world 1 points 1 year ago

I'm talking about Cisco IOS-XR, Juniper JunOS, Arista EOS and others.

Those operating systems are disaggregated, meaning different features can be restarted, replicated, scaled out horizontally, or upgraded without having to disturb the other components in runtime.

Maybe we're getting at the same point from opposite ends. I'm not a traditional software engineer, but I have had academic and professional training on these topics.

[–] Candelestine@lemmy.world 3 points 1 year ago (1 children)

No, after a sufficient amount of time has passed, we would run out of useable matter and energy in the universe. This theorized end-state of heat death puts a finite cap on the size of the Fediverse.

Constrained to Earth, it'd probably be fine. Though I do see it splintering eventually, with sub-communities existing independently from the main organism.

[–] BlushedPotatoPlayers@terefere.eu 0 points 1 year ago (2 children)

But would it work with spherical servers in vacuum?

[–] Miqo@lemmy.world 1 points 1 year ago (1 children)

Time to invent the Dyson Server!

[–] Scribbd@feddit.nl 1 points 1 year ago

We already got one: the dyson.com server. /jk

[–] marsokod@lemmy.world 3 points 1 year ago (1 children)

The world produces 15 Mt of beans every year. The average shitpost with beans has 700 g of beans in it. This means Lemmy can scale to around 21 billion shitposts/year. We have some margin.

#shitpost

[–] Bearded_Baguette@lemmy.world 0 points 1 year ago (1 children)

This math checks out. I ran it through the bean calculator using OpenBeanAI. 32.33% of the simulations show these numbers.

[–] r00ty@kbin.life 2 points 1 year ago (1 children)
[–] WarmSoda@lemm.ee 1 points 1 year ago
[–] ShellMonkey@lemmy.socdojo.com 1 points 1 year ago (2 children)

I guess my answer is: who cares? Don't treat a social media account as some immortal time capsule of your life. Keep a photo album, write some diary entries, but don't rely on any form of social media to be the historical record of your existence. If it's important, keep it somewhere you can ensure its preservation.

I'm pretty sure the world will continue long after we've forgotten beans and not pooping for X days.

[–] PeachMan@lemmy.one 1 points 1 year ago

I think people need to be reminded of two big things when it comes to Lemmy:

  1. It is impermanent. Not intentionally, I'm sure most instances will try to keep all the posts for as long as possible. But we're just hosting this stuff on independent servers (also known as "somebody else's computer") and we can't rely on them to stay online forever.

  2. Lemmy is NOT PRIVATE. You cannot delete your posts, and this is by design. You can edit them, but there's an edit history, and even if there wasn't, it would be impossible to ensure that the old versions of your posts aren't stored on some random, rarely used instance. There is no big man in charge like Mark Zuckerberg that you can sue to delete your data. If you want to use Lemmy privately, DON'T POST YOUR PERSONAL INFO. Don't post things that can be used to identify you. This is a public forum. Treat it like one. If you don't like that, go somewhere else.

Sorry, #2 is kinda off topic, but I see a lot of confusion about what Lemmy is and isn't.

[–] Merulox@lemmy.world 0 points 1 year ago (1 children)

I needed to be reminded of this, thanks.

Still, Reddit is probably the biggest and most accessible source of information in the world, written out of passion by people, experts, professors, neckbeards... trolls... uni students, researchers, and I wish Lemmy could also become the archive that Reddit is. But if information has a high likelihood of getting lost with time, why bother? It should then really only be treated as a very temporary social media which is... okay, I guess.

[–] NatureBoyFlickRair@lemmy.fmhy.ml 0 points 1 year ago (1 children)

Everything is temporary. Nothing is permanent. Embrace it and live in the now.

[–] Flemmy@lemmy.world 2 points 1 year ago

It's weird to think about, but data has a shelf life. Software needs to grow and be pruned regularly, or it dies.

Social media is both - the data dump is useless without an ecosystem of tools around it, and if the data itself stops interacting with the zeitgeist of the parent society, it basically becomes an old journal. It's interesting to a very specific group of people, and literally no one else wants to see it (aside from a few gems picked out and cleaned up for public consumption).

At any point we could go back to Reddit's explosion after the Digg migration. We could pull up posts that mirror exactly what's happening now. It'd be interesting for sure, and there are days' worth of then-and-now posts that people could be making... but instead we just have people telling us about their memories of that process.

Why? Because that data is old and stale. You'd have to hunt it down with tools not intended for it, filter out the best of it, fix broken links, and probably put it through a slur filter.

[–] BlameThePeacock@lemmy.ca 1 points 1 year ago

Smaller instances don't grab everything from every other server; they only grab data from other servers when their users are subscribed to specific communities. I also suspect they don't grab all historical data automatically (though I don't know how much they do grab by default).

Right now there's no migration tool for when instances shut down, but it should be technically possible; someone just needs to implement it.

[–] blueshades@lemmy.world 1 points 1 year ago

The Fediverse needs to encourage different instances. It's the only way it can work. It has the technical framework to do it, and for it to be transparent to the end user, but I feel like it's not there yet.

For example, I think users should be strongly encouraged to choose regional instances instead of lemmy.world (I know, I know, ironic coming from me). It should be the default and require the user to go out of their way to select a different instance. It should also be concisely explained that your instance doesn't matter and that you can see any other federated instance. Yes, this is not always true, but it doesn't matter to someone just joining. Let them get here first and then they'll naturally learn about the intricacies. Don't scare them away at the gates.
