this post was submitted on 15 Feb 2024

Max_P@lemmy.max-p.me 13 points 8 months ago

No, but if they forget to strip those before training the models, the models are gonna start spitting out licenses everywhere, which would be annoying for AI companies.

It's so easily fixed with a simple regex, though, that it's not that useful. But poisoning the data is theoretically possible.
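To make the "simple regex" point concrete, here is a rough sketch of what such a cleanup step might look like. The footer format is an assumption (a Creative Commons license name alone on the last line, optionally written as a Markdown link), and `strip_license_footer` is a made-up helper name, not anything a real training pipeline is known to use.

```python
import re

# Assumed footer shape (hypothetical): the last line of a comment is just a
# CC license name such as "CC BY-NC-SA 4.0", optionally as a Markdown link.
LICENSE_FOOTER = re.compile(
    r"\n*^[ \t]*"                     # blank lines before the footer, then line start
    r"\[?(?:CC|Creative Commons)[ -]" # optional "[" and the license family
    r"BY(?:-(?:NC|SA|ND))*"           # BY plus optional NC/SA/ND clauses
    r" \d\.\d\]?"                     # version number and optional "]"
    r"(?:\([^)\n]*\))?"               # optional "(url)" part of a Markdown link
    r"[ \t]*\Z",                      # nothing else until the end of the comment
    re.IGNORECASE | re.MULTILINE,
)


def strip_license_footer(comment: str) -> str:
    """Remove a trailing license-only line; leave everything else alone."""
    return LICENSE_FOOTER.sub("", comment.rstrip())


if __name__ == "__main__":
    sample = (
        "Poisoning the data is theoretically possible.\n\n"
        "[CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)"
    )
    print(strip_license_footer(sample))
```

The pattern only fires when the license sits alone on the final line, so a comment that merely mentions a license mid-sentence is left untouched. Anyone varying the footer wording would slip past it, but that takes coordination.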

t3rmit3@beehaw.org 1 points 8 months ago

Only if enough people were doing this for it to constitute an algorithmically reducible behavior.

If you could get everyone who mentions a specific word or subject to put a CC license in their comment, then an ML model trained on those comments would likely output the license name when that subject was mentioned. But models don't just randomly insert strings they've seen, without context.
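A toy way to see the context point, under heavy assumptions: this is plain frequency counting over a made-up corpus, not an ML model, and the `<CC-LICENSE>` token and `license_rate` helper are invented for illustration. It only shows the kind of conditional statistic a model would have to pick up for the license to appear in its output.

```python
from collections import Counter

LICENSE = "<CC-LICENSE>"  # hypothetical stand-in for a license footer


def license_rate(comments, subject):
    """Return (rate with subject, rate without subject) of comments ending in LICENSE."""
    counts = Counter()
    for words in (c.split() for c in comments):
        has_subject = subject in words
        counts[(has_subject, "total")] += 1
        # A bool adds as 0 or 1, so this counts licensed comments per group.
        counts[(has_subject, "licensed")] += bool(words) and words[-1] == LICENSE

    def rate(flag):
        total = counts[(flag, "total")]
        return counts[(flag, "licensed")] / total if total else 0.0

    return rate(True), rate(False)


if __name__ == "__main__":
    corpus = [
        "a regex fixed it " + LICENSE,
        "that regex is handy " + LICENSE,
        "I wrote a regex " + LICENSE,
        "my cat is asleep",
        "the weather is nice",
    ]
    with_subject, without_subject = license_rate(corpus, "regex")
    print(with_subject, without_subject)
```

In this made-up corpus the license follows every comment that mentions "regex" and none of the others, so the association is tied to the subject rather than showing up at random.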