Technology

A.I.’s un-learning problem: Researchers say it’s virtually impossible to make an A.I. model ‘forget’ the things it learns from private user data (finance.yahoo.com)

submitted 1 year ago by assassin_aragorn@lemmy.world to c/technology@lemmy.world

I'm rather curious to see how the EU's privacy laws are going to handle this.

(Original article is from Fortune, but Yahoo Finance doesn't have a paywall)

[–] Thann@lemmy.ml 18 points 1 year ago (2 children)

Consent can't be revoked; they're not even trying to get consent.

They seemingly all have a "use first, then ask for forgiveness" approach, which should come around to bite them in the ass.

[–] Jaded@lemmy.dbzer0.com 6 points 1 year ago (2 children)

Anything else is going to bite US in the ass. Asking for consent kills any kind of open source development. It puts AI solely in the hands of like three companies. Our economy is going to be very AI-focused in the future; they would literally own all of us.

You aren't getting paid either way, so we might as well all enjoy the fruits of humanity's labor freely instead of being forced into a subscription model of it.

[–] fushuan@lemm.ee 2 points 1 year ago (1 children)

Asking for consent doesn't kill open source development. Consent is the very reason we have licensed code: MIT, Apache, GPL3... Development is done and code is reused in accordance with those licenses.

[–] Jaded@lemmy.dbzer0.com 1 points 1 year ago

Making LLMs requires a stupid amount of data, much more than what is found in the Creative Commons. Same goes for image gen. Unless you have been accumulating data since forever by tricking people when they sign up to your website or app, you can't train anything without scraping most of the data.

It has nothing to do with licensing; it's the fact that there just isn't enough "free-use" data.

[–] hubobes@sh.itjust.works 2 points 1 year ago (1 children)

Except it does not?

For example: https://commonvoice.mozilla.org/

[–] Jaded@lemmy.dbzer0.com 1 points 1 year ago

"Most of the data used by large companies isn’t available to the majority of people. We think that stifles innovation."

Yes, crowdsourcing is a solution, but it is only really possible if you are able to reach many people like Mozilla can. They only have 20k hours to date. Tortoise needed 50k hours and was made by one guy who open sourced it. He would not have been able to build it without scraping YouTube.

Crowdsourcing also becomes much more complicated for LLMs or if you are making models in other languages.

[–] Touching_Grass@lemmy.world 2 points 1 year ago

They shouldn't need consent unless they're reselling the works in question.