this post was submitted on 18 Oct 2024
49 points (84.5% liked)


I made a robot moderator. It models trust flow through a network that's made of voting patterns, and detects people and posts/comments that are accumulating a large amount of "negative trust," so to speak.
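To make the idea of "trust flow" a bit more concrete, here is a minimal sketch of one way it could work. It is not the bot's actual code: the function names (`propagate_trust`, `flag_negative`), the seed set of trusted users, the damping factor, and the threshold are all assumptions made for illustration. Trusted users pass a share of their trust along their votes, upvotes add trust to the recipient, downvotes subtract it, and accounts with no positive trust pass nothing on.

```python
from collections import defaultdict


def propagate_trust(votes, seeds, rounds=20, damping=0.85):
    """Spread trust through a vote graph (illustrative only).

    votes: iterable of (voter, author, value) with value +1 or -1.
    seeds: dict mapping user -> baseline trust for known-good users.
    Returns a dict mapping user -> trust score.
    """
    out_edges = defaultdict(list)
    users = set(seeds)
    for voter, author, value in votes:
        out_edges[voter].append((author, value))
        users.update((voter, author))

    trust = {u: seeds.get(u, 0.0) for u in users}
    for _ in range(rounds):
        # Everyone restarts from a fraction of their seed trust.
        nxt = {u: (1 - damping) * seeds.get(u, 0.0) for u in users}
        for voter, edges in out_edges.items():
            # Only positively trusted voters pass influence along.
            weight = max(trust[voter], 0.0)
            if weight == 0.0:
                continue
            share = damping * weight / len(edges)
            for author, value in edges:
                nxt[author] += share * value
        trust = nxt
    return trust


def flag_negative(trust, threshold=-0.05):
    """Return users whose accumulated trust fell below the threshold."""
    return sorted(u for u, t in trust.items() if t < threshold)


# Hypothetical data: alice is a seeded, trusted user.
votes = [
    ("alice", "bob", +1),
    ("alice", "troll", -1),
    ("troll", "troll2", +1),
]
trust = propagate_trust(votes, seeds={"alice": 1.0})
flagged = flag_negative(trust)
```

In this toy example the upvote from `troll` gives `troll2` nothing, because an account with negative trust has no influence to pass along. That is one simple way to keep a group of bad-faith accounts from voting each other up.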

In its current form, it is supposed to run autonomously. In practice, I have to step in and fix some of its boo-boos when it makes them, which happens sometimes but not very often.

I think it's working well enough at this point that I'd like to experiment with a mode where it acts as an assistant to an existing moderation team, instead of taking its own actions. I'm thinking about making it auto-report suspect comments, instead of autonomously deleting them. There are other modes that might be useful, but that seems like a good place to start. Is anyone interested in trying the experiment in one of your communities? I'm pretty confident that at this point it can ease moderation load without causing many problems.
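As a rough sketch of the difference between the two modes (again an illustration, not the bot's real code; `Mode`, `Action`, `decide`, and the threshold value are hypothetical names and numbers): the scoring stays the same, and only the action taken on a low-scoring comment changes.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    AUTONOMOUS = "autonomous"  # the bot removes comments itself
    ASSISTANT = "assistant"    # the bot only files reports for human mods


@dataclass(frozen=True)
class Action:
    kind: str          # "remove", "report", or "none"
    comment_id: int
    reason: str = ""


def decide(comment_id, score, mode, threshold=-0.05):
    """Map a trust score to an action, depending on the operating mode."""
    if score >= threshold:
        return Action("none", comment_id)
    reason = f"negative trust score {score:.3f}"
    if mode is Mode.ASSISTANT:
        return Action("report", comment_id, reason)
    return Action("remove", comment_id, reason)


# Hypothetical comment IDs and scores.
report = decide(101, -0.2, Mode.ASSISTANT)
removal = decide(101, -0.2, Mode.AUTONOMOUS)
ignored = decide(102, 0.3, Mode.ASSISTANT)
```

In assistant mode the human moderators still make the final call, so a misfire costs them a dismissed report rather than a wrongly deleted comment.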

!santabot@slrpnk.net

auk@slrpnk.net 2 points 1 month ago

You can do that now, and evade human moderation in the same way.

I don't want you to give it a try in the Santa communities, even though it would be a badly-needed test of the system. The code that's supposed to detect and react to that doesn't get much action. Mostly it's been misfiring on the innocent case, and attacking innocent people because they're new and they said one wrong thing one day. I think I fixed that, but it would be nice to test it in the other case, with some participation that I know is badly intended, and make sure it's still capable of reacting and nuking the comments.

But no, please don't. The remedy for that kind of thing is for admins to do the work of finding and banning you at the source, or to look at banning VPNs or something, which is sad for other reasons, so I don't want that. Just leave it until real bad people do it for real, and then the admins and I will have to work out how to get rid of them when it happens.