submitted 9 months ago by JRepin@lemmy.ml to c/technology@lemmy.ml

This is a classic case of tragedy of the commons, where a common resource is harmed by the profit interests of individuals. The traditional example of this is a public field that cattle can graze upon. Without any limits, individual cattle owners have an incentive to overgraze the land, destroying its value to everybody.

We have commons on the internet, too. Despite all of its toxic corners, it is still full of vibrant portions that serve the public good — places like Wikipedia and Reddit forums, where volunteers often share knowledge in good faith and work hard to keep bad actors at bay.

But these commons are now being overgrazed by rapacious tech companies that seek to feed all of the human wisdom, expertise, humor, anecdotes and advice they find in these places into their for-profit A.I. systems.

[-] hypna@lemmy.world 19 points 9 months ago

A truly poor analogy. LLMs don't remove anything from anywhere. They consume no shared resource.

It's been wild watching people flail about searching for arguments for why LLMs should be stopped. I'm not even saying they shouldn't, just that I haven't seen a solid argument for it.

[-] Spzi@lemm.ee 13 points 9 months ago

As per the article, it goes like this:

  1. AI is trained on publicly available data
  2. AI does not credit or compensate original authors
  3. People don't like their work being used without credit or compensation
  4. People share less publicly
  5. Public spaces become deserted

And simultaneously, AI content of poor quality drowns what is left.

In terms of arguments, have you heard about the control/alignment problem or x-risk?

[-] Bye@lemmy.world 8 points 9 months ago

Isn’t that true with people too? If I read a bunch of books and then use what I learned to write a new book, I’m not crediting the original authors. If I learn painting techniques from Van Gogh and El Greco, I’m not crediting them either.

[-] Madison_rogue@kbin.social 4 points 9 months ago* (last edited 9 months ago)

You're equating sentience with non-sentience. An LLM is a non-sentient program, created by humans to learn language. You are a sentient person who is influenced by the painting techniques of Van Gogh and El Greco. While you don't need to credit them, they have influenced your work. That is an entirely acceptable practice.

This is a huge difference in the realm of copyright.

EDIT

Also, the works of the artists you mention are in the public domain in most countries. They can be used by LLMs without incident. Works of artists not in the public domain should be subject to copyright law when used by LLMs.

[-] lvxferre@lemmy.ml 18 points 9 months ago

But these commons are now being overgrazed by rapacious tech companies that seek to feed all of the human wisdom, expertise, humor, anecdotes and advice they find in these places into their for-profit A.I. systems.

I think that the concept of tragedy of the commons is being misused here. When feeding data into those models, there's no common resource being used, as the data doesn't cease to exist once you feed it to your L"L"M. Instead what's happening is that they're further breaking what was already broken - the legal concepts of IP and copyright.

Where the concept could apply is the usage of the output of those models, with the common resource being the overall quality, reliability, and usefulness of the internet, which gets degraded for the sake of petty benefits (such as advertisement/spamming/marketing). However, this degradation predates the large "language" models (and the internet itself), and it isn't a result of the technology itself.

[-] Icaria@lemmy.world 2 points 9 months ago

This was my first thought as well. My second thought is all the harms we've caused to ourselves in the digital age, and we only start to care when it hits our pocketbooks.

[-] CommieCretzl@hexbear.net 11 points 9 months ago

Again?! Damn I guess it wasn't bad enough already

[-] autotldr@lemmings.world 8 points 9 months ago

This is the best summary I could come up with:


Thanks to artificial intelligence, however, IBM was able to sell Mr. Marston’s decades-old sample to websites that are using it to build a synthetic voice that could say anything.

A.I.-generated books — including a mushroom foraging guide that could lead to mistakes in identifying highly poisonous fungi — are so prevalent on Amazon that the company is asking authors who self-publish on its Kindle platform to also declare if they are using A.I.

But these commons are now being overgrazed by rapacious tech companies that seek to feed all of the human wisdom, expertise, humor, anecdotes and advice they find in these places into their for-profit A.I.

Consider, for instance, that the volunteers who build and maintain Wikipedia trusted that their work would be used according to the terms of their site, which requires attribution.

A Washington Post investigation revealed that OpenAI’s ChatGPT relies on data scraped without consent from hundreds of thousands of websites.

Whether we are professional actors or we just post pictures on social media, everyone should have the right to meaningful consent on whether we want our online lives fed into the giant A.I.


The original article contains 1,094 words, the summary contains 188 words. Saved 83%. I'm a bot and I'm open source!

[-] narwhal@lemmy.ml 4 points 9 months ago
this post was submitted on 23 Sep 2023
79 points (88.3% liked)
