this post was submitted on 26 Aug 2024
48 points (94.4% liked)

Fuck AI


"We did it, Patrick! We made a technological breakthrough!"

A place for all those who loathe AI to discuss things, post articles, and ridicule the AI hype. Proud supporter of working people. And proud booer of SXSW 2024.

founded 8 months ago

I ran an AI startup back in 2017, and this was a huge deal for us; I've seen no actual improvement on this problem since. The NYTimes is spot on, IMO.

top 4 comments
[–] nulluser@programming.dev 10 points 2 months ago (2 children)

This is a threat to LLMs, not AI itself. AI models looking for novel cures for diseases (for just one of many examples) are not trained on random Internet text.

[–] admin@lemmy.haley.io 6 points 2 months ago* (last edited 2 months ago) (1 children)

This is a threat to any neural network that is being continually retrained. Hell, it's even a problem with our brains' NNs. We just call it "believing your own bullshit" or "getting high on your own supply".

The issue with NNs looking for cures for diseases (or anything else that isn't trained off of the internet) is that they are basically out of training data. They'll need orders of magnitude more to get better, and we just don't have it. We haven't figured out a way to learn effectively from less data, and there's no real movement on that front either.
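The "believing your own bullshit" failure mode above can be shown with a toy sketch (my own illustration, not from the thread): stand in for a "model" with a fitted Gaussian, and at each generation train it only on samples drawn from the previous generation's fit. With a finite sample size, the fitted spread drifts downward over generations and the distribution collapses.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.0, 1.0   # generation 0: the "real data" distribution
n = 50                 # finite training set drawn each generation

history = [sigma]
for generation in range(2000):
    # train only on the previous generation's output, never on real data
    samples = rng.normal(mu, sigma, size=n)
    # "retrain": refit the model to its own output
    mu, sigma = samples.mean(), samples.std()
    history.append(sigma)

# the fitted spread shrinks over generations: model collapse in miniature
print(f"sigma: gen 0 = {history[0]:.3f}, gen 2000 = {history[-1]:.3g}")
```

The mechanism is that each refit adds a bit of sampling noise, and in expectation the log of the fitted spread drifts downward, so variety is lost generation by generation; LLM-on-LLM training is a far more complex version of the same feedback loop.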

What we have right now is essentially the culmination of research that has been going on since the 1960s, finally realized once we figured out:

  • We can map our NN variables to a matrix
  • We can use linear algebra to optimize the loss on that matrix
  • We can leverage video cards to crunch the linear algebra
  • We have the largest data set ever created in order to get our loss lower than ever before
[–] nulluser@programming.dev -3 points 2 months ago* (last edited 2 months ago)

Completely irrelevant. The title and posted article are talking about unintentionally training LLM text generation models with prior output of other AI models. Not having enough training data for other types of models is a completely different problem and not what the article is about.

Nobody is going to "trawl the web for new data to train their next models” (to quote the article) for a model trying to cure diseases.

[–] FierySpectre@lemmy.world 5 points 2 months ago

AI-labeled datasets have this problem to some extent too, if they're not manually checked.