this post was submitted on 11 Jan 2024
251 points (100.0% liked)

Technology


A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.

Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.


Apparently, stealing other people's work to create a product for money is now "fair use" according to OpenAI, because they are "innovating" (stealing). Yeah. Move fast and break things, huh?

"Because copyright today covers virtually every sort of human expression—including blogposts, photographs, forum posts, scraps of software code, and government documents—it would be impossible to train today’s leading AI models without using copyrighted materials," wrote OpenAI in the House of Lords submission.

OpenAI claimed that the authors in that lawsuit "misconceive[d] the scope of copyright, failing to take into account the limitations and exceptions (including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence."

(page 2) 50 comments
[–] FracturedPelvis@lemmy.ml 11 points 10 months ago (1 children)

The real issue is money. How much and how (un)distributed.

Why is it fair/ok that one company can use all this material and make a lot of money off it without paying or even acknowledging others work?

On the flip side, AI models could be useful. Maybe the models/weights should be made free, just like the content they're trained on. Instead of paying for the model, we should pay for hosting the inference (i.e. the API).

[–] ky56@aussie.zone 10 points 10 months ago (1 children)

All the AI race has done is surface the long-standing issue of how broken copyright is in the internet era. Artists should be compensated, but trying to do that with the traditional model, which was originally designed with physical, non-infinitely-copyable goods in mind, is just asinine.

One such model could be to automatically assign the copyright owner by first upload on any platform that supports the API, an API provided and enforced by the US Copyright Office. A percentage of end-use revenue could then be paid back as royalties. I haven't really thought this model out much further than that.
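As a rough illustration only, the first-upload-wins registry with royalty payback described above could be sketched like this. Everything here is hypothetical: no such US Copyright Office API exists, and all class and method names are invented for the sketch.

```python
import hashlib

# Hypothetical sketch of the scheme proposed above: ownership is assigned
# to whoever first registers a work, and a percentage of end-use revenue
# accrues back to that owner as royalties. Illustrative only.

class CopyrightRegistry:
    def __init__(self):
        self._owners = {}    # content hash -> owner
        self._balances = {}  # owner -> accrued royalties

    def register_upload(self, content: bytes, uploader: str) -> str:
        """First upload wins: later registrations of the same content
        do not change the recorded owner."""
        digest = hashlib.sha256(content).hexdigest()
        self._owners.setdefault(digest, uploader)
        return self._owners[digest]

    def pay_royalty(self, content: bytes, revenue: float, rate: float = 0.05):
        """Credit a fixed percentage of end-use revenue to the owner."""
        digest = hashlib.sha256(content).hexdigest()
        owner = self._owners.get(digest)
        if owner is not None:
            self._balances[owner] = self._balances.get(owner, 0.0) + revenue * rate

    def balance(self, owner: str) -> float:
        return self._balances.get(owner, 0.0)
```

Identifying works by content hash is itself a simplification; a real system would need to handle near-duplicates and derivative works, which is where most of the hard problems live.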

Machine learning is here to stay and is a useful tool that can be used for good and evil alike.

[–] Kichae@lemmy.ca 9 points 10 months ago (1 children)

Nah. Copyright is broken, but it's broken because it lasts too long, and it can be held by constructs. People should still reserve the right to not have the things they've made incorporated into projects or products they don't want to be associated with.

The right to refusal is important. Consent is important. The default permission should not be shifted to "yes" in anybody's mind.

The fact that a not insignificant number of people seem to think the only issue here is money points to some pretty fucking entitled views among the would-be-billionaires.

[–] onlinepersona@programming.dev 10 points 10 months ago (3 children)

Wait, so if the way I make money is illegal now, it's the system's fault, isn't it? That means I can keep going because I believe I'm justified, right? Right?

CC BY-NC-SA 4.0

[–] GammaGames@beehaw.org 8 points 10 months ago

Could they be legally required to open-source the LLM? I believe them, but that doesn't make it right.

[–] furrowsofar@beehaw.org 8 points 10 months ago

Of course it is. About 50 years ago we moved to a regime where everything is copyrighted, rather than just works that were marked and registered. Not sure where I stand on that. One could argue we're in a crazy over-copyright era now anyway.

[–] BoastfulDaedra@lemmynsfw.com 7 points 10 months ago (2 children)

Yes, well, a pirate ship can't stay in business without raiding trade convoys, either.

[–] DavidGarcia@feddit.nl 6 points 10 months ago

ip protections are a spook anyway

[–] Critical_Insight@feddit.uk 6 points 10 months ago (2 children)

There's not a musician who hasn't heard other songs before, not a painter who hasn't seen other paintings, no comedian who hasn't heard jokes, no writer who hasn't read books.

AI haters are not applying the same standards to humans that they do to generative AI. Obviously this is not to say that AI can't plagiarize. If it's spitting out sentences that are direct quotes from an article someone wrote before and doesn't disclose the source, then yeah, that is an issue. There's a limit, however, after which the output differs enough from the input that you can't claim it's stealing, even if it perfectly mimics the style of someone else.

Just because DALL-E creates pictures that have a Getty Images watermark on them doesn't mean the picture itself is a direct copy from their database. If anything, it's the use of the logo that's the issue, not the picture.

[–] sculd@beehaw.org 8 points 10 months ago (2 children)

Said in another thread but I will repeat here. AIs are not humans. AIs' creative process and learning process are also different.

AIs are being used to make profit for executives while creators suffer.

[–] BraveSirZaphod@kbin.social 7 points 10 months ago

AI haters are not applying the same standards to humans that they do to generative AI

I don't think it should go unquestioned that the same standards should apply. No human is able to look at billions of creative works and then create a million new works in an hour. There's a meaningfully different level of scale here, and so it's not necessarily obvious that the same standards should apply.

If it’s spitting out sentences that are direct quotes from an article someone wrote before and doesn’t disclose the source then yeah that is an issue.

A fundamental issue is that LLMs simply cannot do this. They can query a webpage, find a relevant chunk, and spit that back at you with a citation, but it is simply impossible for them to actually generate a response to a query, realize that they've generated a meaningful amount of copyrighted material, and disclose its source, because it literally does not know its source. This is not a fixable issue unless the fundamental approach to these models changes.
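The distinction drawn above, between retrieving text (which carries a source) and generating it (which does not), can be made concrete with a toy sketch. This is purely illustrative: the corpus, function names, and the stand-in "generation" step are all invented for the example.

```python
# Toy illustration: a retrieval step can attach a citation because the
# source document travels with the text; a pure generation step cannot,
# because model weights store no per-output provenance.

CORPUS = {
    "doc-42": "The quick brown fox jumps over the lazy dog.",
}

def answer_with_retrieval(query: str):
    """Find a relevant chunk and return it *with* its source id."""
    for doc_id, text in CORPUS.items():
        if any(word in text.lower() for word in query.lower().split()):
            return text, doc_id  # provenance is available
    return None, None

def answer_by_generation(query: str):
    """Stand-in for sampling from a trained model: the output string has
    no attached source, even if it happens to match training text."""
    return "The quick brown fox jumps over the lazy dog.", None
```

The point of the sketch is the return type: only the retrieval path has a source id to disclose, which is why citation features in real products are built on retrieval rather than on the model's own generation.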
