762
submitted 10 months ago by L4s@lemmy.world to c/technology@lemmy.world

OpenAI now tries to hide that ChatGPT was trained on copyrighted books, including J.K. Rowling's Harry Potter series::A new research paper laid out ways in which AI developers should try and avoid showing LLMs have been trained on copyrighted material.

you are viewing a single comment's thread
view the rest of the comments
[-] TropicalDingdong@lemmy.world 114 points 10 months ago

Its a bit pedantic, but I'm not really sure I support this kind of extremist view of copyright and the scale of whats being interpreted as 'possessed' under the idea of copyright. Once an idea is communicated, it becomes a part of the collective consciousness. Different people interpret and build upon that idea in various ways, making it a dynamic entity that evolves beyond the original creator's intention. Its like issues with sampling beats or records in the early days of hiphop. Its like the very principal of an idea goes against this vision, more that, once you put something out into the commons, its irretrievable. Its not really yours any more once its been communicated. I think if you want to keep an idea truly yours, then you should keep it to yourself. Otherwise you are participating in a shared vision of the idea. You don't control how the idea is interpreted so its not really yours any more.

If thats ChatGPT or Public Enemy is neither here nor there to me. The idea that a work like Peter Pan is still possessed is such a very real but very silly obvious malady of this weirdly accepted but very extreme view of the ability to possess an idea.

[-] Bogasse@lemmy.world 14 points 10 months ago

Well, I'd consider agreeing if the LLMs were considered as a generic knowledge database. However I had the impression that the whole response from OpenAI & cie. to this copyright issue is "they build original content", both for LLMs and stable diffusion models. Now that they started this line of defence I think that they are stuck with proving that their "original content" is not derivated from copyrighted content 🤷

[-] TropicalDingdong@lemmy.world 1 points 10 months ago

Well, I’d consider agreeing if the LLMs were considered as a generic knowledge database. However I had the impression that the whole response from OpenAI & cie. to this copyright issue is “they build original content”, both for LLMs and stable diffusion models. Now that they started this line of defence I think that they are stuck with proving that their “original content” is not derivated from copyrighted content 🤷

Yeah I suppose that's on them.

load more comments (21 replies)
this post was submitted on 22 Aug 2023
762 points (95.7% liked)

Technology

55692 readers
3392 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related content.
  3. Be excellent to each another!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, to ask if your bot can be added please contact us.
  9. Check for duplicates before posting, duplicates may be removed

Approved Bots


founded 1 year ago
MODERATORS