this post was submitted on 09 Jul 2023
76 points (100.0% liked)

Technology


Comedian and author Sarah Silverman and authors Christopher Golden and Richard Kadrey are suing OpenAI and Meta, each in a US District Court, over dual claims of copyright infringement.

[–] sky@lemmy.codesink.io 29 points 1 year ago (21 children)

Interested to see how this plays out! Their argument that the only way an LLM could summarize their book is by ingesting the full copyrighted work seems a bit suspect, as it could've ingested plenty of reviews and summaries written by humans and combined that information.

I'm not confident that they'll be able to prove OpenAI or Meta infringed copyright, just as I'm not confident they'll be able to prove that they didn't. I don't know if anyone really knows what these things are trained on.

We got to where we are now with fair use in search and online commentary because of a ton of lawsuits setting precedent, so it's not surprising we'll have to do the same with machine learning.

[–] FaceDeer@kbin.social 15 points 1 year ago (12 children)

Even if they did train the model on the entire text of the book, that's still not necessarily a copyright violation. I would think not, since the resulting model doesn't actually have a copy of the book embedded within it.

[–] Harlan_Cloverseed@kbin.social 1 point 1 year ago (2 children)
[–] FaceDeer@kbin.social 9 points 1 year ago* (last edited 1 year ago)

How do we "know" anything where the answers are just being made up as part of humanity's collective cultural game of Calvinball?

Courts in various jurisdictions will make various rulings. Judges will interpret them in various ways. Legislators will chime in with new legislation and new treaties. Internet arguments will churn away with a whole range of assumptions about what is true or false that may or may not have anything to do with reality.

I present my opinion here. I feel it is well informed and I can back it up in various ways when challenged. But nobody "knows" anything because these aren't laws of physics or math that we're talking about here.

Or did you mean whether we know if a copy of the book is embedded in the model? That can be more objectively tested, at least.
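For what it's worth, a minimal sketch of how such a test might look, assuming a hypothetical `complete(prompt)` text-completion interface and a passage from the book (both stand-ins here, not any real API):

```python
import difflib

def verbatim_overlap(model_continuation: str, actual_text: str) -> float:
    """Return a 0-1 similarity score between the model's output and the
    real continuation of the passage."""
    return difflib.SequenceMatcher(None, model_continuation, actual_text).ratio()

def memorization_probe(complete, passage: str, prompt_words: int = 50) -> float:
    """Feed the model the opening of a passage and check whether it
    reproduces the rest nearly verbatim.

    `complete` is a placeholder for whatever text-completion interface
    you have: it takes a prompt string and returns the continuation.
    """
    words = passage.split()
    prompt = " ".join(words[:prompt_words])
    expected = " ".join(words[prompt_words:])
    produced = complete(prompt)
    # Scores near 1.0 across many passages would suggest the text is
    # effectively recoverable from the model; low scores suggest it isn't.
    return verbatim_overlap(produced[: len(expected)], expected)
```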

[–] secrethat@kbin.social 3 points 1 year ago

AFAIK it takes these large bodies of text and, rather than digesting them and keeping them in some sort of database, it holistically (and I'm generalising here) looks at how often certain words are strung together and takes note of that. Let's call those weights.

Then users can prompt something, and the 'magic' here is that it is able to pick out words of different weights based on the prompt, whether you're writing an angry email to your boss, code in Python, or the structure for a book.

But it is unable to recreate the book from a prompt.
People who know the topic more intimately, please correct me if I am wrong.
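To put that intuition in code, here's a toy sketch of the "counting how often words are strung together" idea: a simple bigram model over a made-up corpus, nothing like a real LLM:

```python
import random
from collections import Counter, defaultdict

# Toy "training": count how often each word follows another.
corpus = "the cat sat on the mat and the cat slept on the mat".split()
follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1  # these counts play the role of "weights"

def generate(start: str, length: int = 8) -> str:
    """Generate text by repeatedly sampling a likely next word."""
    word, output = start, [start]
    for _ in range(length):
        counter = follows.get(word)
        if not counter:
            break
        choices, weights = zip(*counter.items())
        word = random.choices(choices, weights=weights)[0]
        output.append(word)
    return " ".join(output)

print(generate("the"))
```

The point of the toy is that generation works off those counts rather than off a stored copy of the training text, though with this little data it can still echo phrases verbatim.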
