this post was submitted on 26 Jul 2023
900 points (100.0% liked)

Technology


Thousands of authors demand payment from AI companies for use of copyrighted works

Thousands of published authors are requesting payment from tech companies for the use of their copyrighted works in training artificial intelligence tools, marking the latest intellectual property critique to target AI development.

[–] squaresinger@feddit.de 77 points 1 year ago (3 children)

Well, if you ask e.g. ChatGPT for the lyrics to a song or page after page of a book, and it spits them out 1:1 correct, you could assume that it must have had access to the original.

[–] dojan@lemmy.world 34 points 1 year ago (1 children)

Or at least excerpts from it. But even then, it's one thing for a person to put up a quote from their favourite book on their blog, and a completely different thing for a private company to use that data to train a model, and then sell it.

[–] chakan2@lemmy.world 15 points 1 year ago (2 children)

you could assume that it must have had access to the original.

I don't know if that's true. Say Google grabs that book from a pirate site and then publishes the work as search results. ChatGPT grabs the work from Google's results and cobbles it back together as the original.

Who's at fault?

I don't think it's as straightforward as "ChatGPT can reproduce the work, therefore it stole it."

[–] squaresinger@feddit.de 11 points 1 year ago (1 children)

Copyright doesn't work like that. Say I sell you the rights to Thriller by Michael Jackson. You might not know that I don't have the rights. But even if you bought the rights from me, whoever actually has the rights is totally in their legal right to sue you, because you never actually purchased any rights.

So if ChatGPT rips it off Google, who ripped it off a pirate site, then everyone in that chain who reproduced copyrighted works without permission from the copyright owners is liable for the damages caused by their unpermitted reproduction.

It's literally the same as with downloading: getting something from a pirate site doesn't make it legal just because someone ripped it before you.

[–] Rodeo@lemmy.ca 4 points 1 year ago (1 children)

That's a terrible example because under copyright law downloading a pirated thing isn't actually illegal. It's the distribution that is illegal (uploading).

[–] squaresinger@feddit.de 2 points 1 year ago

Yes, downloading is illegal, and the media is still an illegally obtained copy. It's just never prosecuted, because the damages are minuscule if you just download. They can only fine you for the amount of damages you caused by violating the copyright.

If you upload to 10k people, they can claim that every one of them would have paid for it, so the damages are (if one copy is worth €30) ~€300k. That's a lot of money and totally worth the lawsuit.

On the other hand, if you just download, the damages are just the value of one copy (in this case €30). That's so minuscule that even having a lawyer write a letter is more expensive.


But that's totally beside the point. OpenAI didn't just download, they replicate, which is causing massive damages, especially to the original artists, who in many cases are no longer hired, since ChatGPT replaces them.

[–] ProfessorZhu@lemmy.world 9 points 1 year ago (1 children)

Can it recreate anything 1:1? When both my wife and I tried to get them to do that they would refuse, and if pushed they would fail horribly.

[–] squaresinger@feddit.de 10 points 1 year ago (2 children)

This is what I got. Looks pretty 1:1 to me.

[–] jackie_jormp_jomp@lemmy.world 11 points 1 year ago (1 children)

Hilarious that it started with just "Buddy", like you'd be happy with only the first word.

[–] squaresinger@feddit.de 7 points 1 year ago* (last edited 1 year ago)

Yeah, for some reason it does that a lot when I ask it for copyrighted stuff.

As if it knew it wasn't supposed to output that.

[–] Cheems@lemmy.world 6 points 1 year ago (1 children)

To be fair you'd get the same result easier by just googling "we will rock you lyrics"

How is ChatGPT knowing the lyrics to that song different from a website that just tells you the lyrics of the song?

[–] squaresinger@feddit.de 6 points 1 year ago

Two points:

  • Google spitting out the lyrics isn't OK from a copyright standpoint either. The reason songwriters/singers/music companies don't sue people who publish lyrics (even though they totally could) is that there are no damages. They sell music, so the lyrics being published for free doesn't hurt their music business and it also doesn't hurt their songwriting business. Other types of copyright infringement that musicians/music companies care about are heavily policed, also on Google.

  • Content generation AI has a different use case, and it could totally hurt both of these businesses. My test from above that got it to spit out the lyrics verbatim shows that the AI did indeed use copyrighted works for its training. Now I can ask GPT to generate lyrics in the style of Queen, and it will basically perform the songwriter's job. This can easily be done on a commercial scale, replacing the very humans who wrote these lyrics. Now take this a step further and take a voice-generating AI (of which there are many), which was similarly trained on copyrighted audio samples of Freddie Mercury. Then add to the mix a music-generating AI, also fed with works of Queen, and now you have a machine capable of generating fake Queen songs based directly on Queen's works. You can do the very same with other types of media as well.

And this is where the real conflict comes from.