Technology

Wikipedia Editors Adopt ‘Speedy Deletion’ Policy for AI Slop Articles | 404 Media (www.404media.co)

submitted 1 month ago by theangriestbird@beehaw.org to c/technology@beehaw.org

theangriestbird@beehaw.org 57 points 1 month ago (2 children)

I think the how is the most interesting part here.

The solution Wikipedians came up with is to allow the speedy deletion of clearly AI-generated articles that broadly meet two conditions. The first is if the article includes “communication intended for the user.” This refers to language in the article that is clearly an LLM responding to a user prompt, like “Here is your Wikipedia article on…,” “Up to my last training update …,” and “as a large language model.” This is a clear tell that the article was generated by an LLM, and a method we’ve previously used to identify AI-generated social media posts and scientific papers.

The other condition that would make an AI-generated article eligible for speedy deletion is if its citations are clearly wrong, another type of error LLMs are prone to. This can include both the inclusion of external links for books, articles, or scientific papers that don’t exist and don’t resolve, or links that lead to completely unrelated content. Wikipedia's new policy gives the example of “a paper on a beetle species being cited for a computer science article.”
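As a rough illustration of the first condition quoted above, here is a minimal Python sketch of a phrase check. The phrase list is seeded only from the three examples in the quote, and the names `CHATBOT_TELLS` and `find_chatbot_tells` are made up for this example; this is not Wikipedia's actual tooling or list of tells.

```python
import re

# Hypothetical phrase list, taken from the three examples quoted above.
# This is an assumption for illustration, not Wikipedia's real criteria.
CHATBOT_TELLS = [
    r"here is your wikipedia article on",
    r"up to my last training update",
    r"as a large language model",
]

# One case-insensitive pattern that matches any of the phrases.
_TELL_RE = re.compile("|".join(CHATBOT_TELLS), re.IGNORECASE)


def find_chatbot_tells(text: str) -> list[str]:
    """Return every matched tell phrase, lowercased, in order of appearance."""
    return [match.group(0).lower() for match in _TELL_RE.finditer(text)]


draft = "Here is your Wikipedia article on beetles. As a large language model, I cannot verify this."
print(find_chatbot_tells(draft))
# ['here is your wikipedia article on', 'as a large language model']
```

A check like this only flags candidates for a human to look at, and it stops working as soon as someone strips the phrases out before submitting.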

hansolo 8 points 1 month ago

JHFC

jarfil@beehaw.org 7 points 1 month ago* (last edited 1 month ago) (1 children)

Sounds fair. The only issue might be that creating an automated cleanup tool to remove those triggers wouldn't be all that difficult.

ranandtoldthat@beehaw.org 18 points 1 month ago (1 children)

Speedy deletion is for deletions that require zero discussion, so it needs to be very simple and clear. For less sloppy genAI there may need to be a discussion (unless it falls under different speedy deletion criteria).

Sometimes those discussions are very straightforward, but they allow for dissenting voices. For "almost obvious" cases, though, not a lot of effort is spent on them.

jarfil@beehaw.org 4 points 1 month ago

Of course. I also hope this will stop like 99% of the skiddie spam. I'm just afraid that, as has happened with hacking in general, a noob installing Kali will get a ton of one-click ways to bypass these measures... and then, what's next?

GenAI tools inserting watermarks would be great, but that's hard to do with text in any way that isn't easily removed.

kehet@sopuli.xyz 12 points 1 month ago (1 children)

I don't understand why ppl do this. Is this malicious? Do they think they are somehow helping?

smeg@feddit.uk 14 points 1 month ago* (last edited 1 month ago)

I saw a similar story about how an open source software project (I think it was curl) have cancelled their bug bounty programme because it's being overrun with LLM-generated reports and they don't have enough volunteers to verify them all. The relevant bit is that while many do it for the financial reward, some do it for reputation, and some genuinely think they're helping by adding info they think is missing, without realising that what they're posting is unreliable.