this post was submitted on 24 Jul 2024
Asklemmy
Yep, although the economics of that depend on what you're doing. I'm trying not to mention too many details, because internet hooliganism is one of the few things I think I could make worse just by publicly and accessibly explaining, haha.
I know of people with similar mechanisms who had problems with very sincere-sounding bad actors before ChatGPT. Best of luck with it, though. It's how I got into my instance.
Hey, unrelated, but do you know if they ever got the database code cleaned up? One of these days that's actually going to start to bite; my instance already had to do a hardware upgrade once.
I should try and figure out how a list of bad IPs would best fit into ActivityPub. It sounds like it would be easy enough to add.
It's been done, we can do it again!
There are many ChatGPT answers, but I think this more affects instances like Beehaw who ask for an essay and have to pick the AI out from the others. My instance has a short and specific question and works to weed out a lot of this, though I'm confident some spammers still get through (and are sitting on accounts waiting for them to age up a bit).
I'm not familiar with that specific code, but it probably depends on the last time you looked at it. In the early reddit migration days a lot of optimisation changes were made in a hurry, but there were issues that arose as instances scaled. These were patched up by various releases but on my instance the average CPU usage of the 0.19 versions is 30% or more up on the 0.18s.
Being in NZ we were also hit hard by the issue of federation being sequential rather than concurrent. To this day we are running an extra VM in Finland to batch up activities and send them in bulk to be replayed on the Lemmy server. I'm pretty sure I saw a pull request for that recently though, so it might be fixed in the next version (but we'll have to wait until Lemmy.world updates, if I understand it correctly).
Perhaps such a thing exists for Mastodon and could be applied to Lemmy?
Fascinating, I didn't realise the latency down there was that bad. How hard was it to get the process working across two distant servers like that?
Hmm, doesn't look like it. The relevant source doesn't mention anything, and a GitHub question from 2022 doesn't mention a dedicated feature, although there are some publicly posted lists being shared.
Lemmy servers don't send the next activity until the first is received. From memory it was something like 150-200ms for the round trip to Finland and back. That means a maximum of about 5 or 6 activities per second at the best of times. However, when Lemmy receives say a new comment, it then sends a request to retrieve the user details from the user's instance, and the whole pipeline is held up. The worst I saw was occasional activities taking 8 seconds to complete (I guess whatever data was being fetched was on a slow instance).
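To put rough numbers on that, here's a back-of-the-envelope sketch. It assumes a strictly one-at-a-time sender where each activity costs one full round trip; the function names are made up for illustration and are not anything from Lemmy itself.

```python
def max_activities_per_second(round_trip_ms: float) -> float:
    """Upper bound on throughput when the sender waits for each
    activity to be acknowledged before sending the next one."""
    return 1000.0 / round_trip_ms


def slots_lost_to_stall(stall_seconds: float, round_trip_ms: float) -> float:
    """How many normal-speed activities could have been delivered
    in the time a single slow activity holds up the pipeline."""
    return stall_seconds * 1000.0 / round_trip_ms


if __name__ == "__main__":
    # Round-trip figures quoted above (from memory, so approximate).
    for rtt in (150, 200):
        rate = max_activities_per_second(rtt)
        print(f"{rtt} ms round trip: at most {rate:.1f} activities/s")

    # One activity that takes 8 seconds blocks everything queued behind it.
    print(f"8 s stall at 200 ms: {slots_lost_to_stall(8, 200):.0f} activities' worth of time")
```

At 200 ms that ceiling works out to 5 activities a second, and a single 8 second stall costs the same time as about 40 normal deliveries, which is why a few slow fetches hurt so much.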
At one point, kbin.social hammered Lemmy.world with duplicate requests, which Lemmy.world then tried to federate out, and that was when the problem was noticed (though Lemmy.world does average more than 5 activities a second, so even after the kbin issues stopped we couldn't recover). Nothing4You, a guy on Matrix (I'm not sure of the Lemmy username), built a pre-fetcher to trigger Lemmy to retrieve details of posts before Lemmy.world tried to federate them out, which helped in those situations where it was taking multiple seconds to retrieve all the details. It helped but was not enough to turn the tide, and we were still getting further and further behind. Nothing4You was meanwhile building a complete batching solution, which you can see on GitHub.
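For anyone wondering why batching helps so much, here's a toy model. To be clear, this is not Nothing4You's actual code, just a sketch assuming that a whole batch crosses the long link in one round trip and that replaying it locally is cheap enough to ignore. The batch size of 100 and all the names are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batches(activities: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group a stream of activities into lists of at most `size` items."""
    it = iter(activities)
    while chunk := list(islice(it, size)):
        yield chunk


def sequential_seconds(count: int, round_trip_ms: float) -> float:
    """Time to deliver `count` activities at one round trip each."""
    return count * round_trip_ms / 1000.0


def batched_seconds(count: int, batch_size: int, round_trip_ms: float) -> float:
    """Time to deliver the same activities at one round trip per batch
    (assumption: local replay time is negligible)."""
    trips = -(-count // batch_size)  # ceiling division
    return trips * round_trip_ms / 1000.0


if __name__ == "__main__":
    backlog = [{"id": i} for i in range(1000)]  # stand-in activities
    print(len(list(batches(backlog, 100))), "batches")
    print(sequential_seconds(len(backlog), 200), "s one at a time")
    print(batched_seconds(len(backlog), 100, 200), "s batched")
```

Under those assumptions the cost over the long link scales with the number of batches rather than the number of activities, so the round trip stops being the bottleneck.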
So for me? It was easy, I just signed up for a server and ran an Ansible playbook to set it up, then added a Docker container to the Lemmy stack, all the while getting personalised help 🙂. I'm not sure how hard it was to conceptualise a solution, build it, test it, and make sure it was fault tolerant, because I didn't have to!