this post was submitted on 27 Nov 2024
33 points (94.6% liked)

Fuck AI


"We did it, Patrick! We made a technological breakthrough!"

A place for all those who loathe AI to discuss things, post articles, and ridicule the AI hype. Proud supporter of working people. And proud booer of SXSW 2024.


Bluesky may have said it won't use user data to train generative AI, but someone else just published a dataset of a million Bluesky posts for "machine learning research". The dataset is already very popular, so your data may have been scraped.


top 5 comments
[–] hexagonwin@lemmy.sdf.org 3 points 6 hours ago

tbh this can happen with everything now so..

i'm not sure what would be the solution, sadly.

[–] KurtVonnegut@mander.xyz 5 points 8 hours ago (2 children)

The same can and will happen with the Fediverse, right?

[–] GeneralEmergency@lemmy.world 4 points 8 hours ago

Probably already happened

[–] Viking_Hippie@lemmy.world 2 points 7 hours ago (1 children)

Probably not. An enormous amount of publicly available data on a single instance, like with Bluesky, is an AI scraper's wet dream.

The fediverse, in contrast, has far fewer people spread around perhaps HUNDREDS of instances. That's a much less appealing effort-to-reward ratio for the scrapers.

[–] KurtVonnegut@mander.xyz 6 points 7 hours ago

I see. Probably mastodon.social gets scraped, then 🫣