this post was submitted on 08 Jun 2025

832 points (95.4% liked)

Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. They just memorize patterns really well. (archive.is)

submitted 1 week ago* (last edited 6 days ago) by Allah@lemm.ee to c/technology@lemmy.world

348 comments

LOOK MAA I AM ON FRONT PAGE

[–] NostraDavid@programming.dev -2 points 6 days ago (3 children)

OK, and? A car doesn't run like a horse either, yet they are still very useful.

I'm fine with the distinction between human reasoning and LLM "reasoning".

[–] MangoCats@feddit.it 0 points 6 days ago (2 children)

It's not just the memorization of patterns that matters, it's the recall of appropriate patterns on demand. Call it what you will, even if AI is just a better librarian for search work, that's value - that's the new Google.

[–] Nanook@lemm.ee 229 points 1 week ago (58 children)

lol is this news? I mean, we call it AI, but it's just LLMs and variants. It doesn't think.

[–] MNByChoice@midwest.social 77 points 1 week ago (1 children)

The "Apple" part. CEOs only care what companies say.

[–] kadup@lemmy.world 51 points 1 week ago (5 children)

Apple is significantly behind and arrived late to the whole AI hype, so of course it's in their absolute best interest to keep showing how LLMs aren't special or amazingly revolutionary.

They're not wrong, but the motivation is also pretty clear.

[–] homesweethomeMrL@lemmy.world 29 points 1 week ago

“Late to the hype” is actually a good thing. Gen AI is a scam wrapped in idiocy wrapped in a joke. That Apple is slow to ape the idiocy of Microsoft is just fine.

[–] minoscopede@lemmy.world 67 points 1 week ago* (last edited 1 week ago) (22 children)

I see a lot of misunderstandings in the comments 🫤

This is a pretty important finding for researchers, and it's not obvious by any means. This finding is not showing a problem with LLMs' abilities in general. The issue they discovered is specifically for so-called "reasoning models" that iterate on their answer before replying. It might indicate that the training process is not sufficient for true reasoning.

Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that's a flaw that needs to be corrected before models can actually reason.

[–] mavu@discuss.tchncs.de 58 points 1 week ago

No way!

Statistical Language models don't reason?

But OpenAI, robots taking over!

[–] Jhex@lemmy.world 49 points 1 week ago (1 children)

this is so Apple, claiming to invent or discover something "first" 3 years later than the rest of the market

[–] sev@nullterra.org 49 points 1 week ago (38 children)

Just fancy Markov chains with the ability to link bigger and bigger token sets. It can only ever kick off processing as a response and can never initiate any line of reasoning. This, along with the fact that its working set of data can never be updated moment-to-moment, means that it would be a physical impossibility for any LLM to achieve any real "reasoning" processes.
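
For anyone unfamiliar with the comparison, here is a minimal sketch of a word-level Markov chain text generator. It only illustrates the analogy drawn above (the next token is picked from what followed the current state in the training text) and is not how any real LLM is implemented. The names `build_chain` and `generate` and the toy corpus are made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(tokens, order=1):
    """Map each run of `order` tokens to the list of tokens seen right after it."""
    chain = defaultdict(list)
    for i in range(len(tokens) - order):
        state = tuple(tokens[i:i + order])
        chain[state].append(tokens[i + order])
    return chain


def generate(chain, start, length, rng):
    """Extend `start` by up to `length` tokens, sampling each from the chain."""
    state = tuple(start)
    out = list(state)
    for _ in range(length):
        followers = chain.get(state)
        if not followers:  # state never seen in training text: stop
            break
        nxt = rng.choice(followers)
        out.append(nxt)
        state = state[1:] + (nxt,)  # slide the window forward by one token
    return out


# Toy corpus (hypothetical), split on whitespace instead of a real tokenizer.
corpus = "the model predicts the next token and the next token follows the model".split()
chain = build_chain(corpus, order=1)
print(" ".join(generate(chain, ["the"], 8, random.Random(0))))
```

Raising `order` is the "bigger and bigger token sets" part: the state becomes a longer window, so the output looks more coherent, but the generator still only reacts to the state it is handed.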

[–] brsrklf@jlai.lu 47 points 1 week ago (2 children)

You know, despite not really believing LLM "intelligence" works anything like real intelligence, I kind of thought maybe being good at recognizing patterns was a way to emulate it to a point...

But that study seems to prove they're still not even good at that. At first I was wondering how hard the puzzles must have been, and then there's a bit about LLMs finishing 100-move Towers of Hanoi (on which they were trained) and failing 4-move river crossings. Logically, those problems are very similar... They also failed to apply a step-by-step solution they were given.

[–] auraithx@lemmy.dbzer0.com 39 points 1 week ago

This paper doesn’t prove that LLMs aren’t good at pattern recognition, it demonstrates the limits of what pattern recognition alone can achieve, especially for compositional, symbolic reasoning.

[–] Mniot@programming.dev 42 points 1 week ago

I don't think the article summarizes the research paper well. The researchers gave the AI models simple-but-large (which they confusingly called "complex") puzzles. Like Towers of Hanoi but with 25 discs.

The solution to these puzzles is nothing but patterns. You can write code that will solve the Tower puzzle for any size n and the whole program is less than a screen.

The problem the researchers see is that on these long, pattern-based solutions, the models follow a bad path and then just give up long before they hit their limit on tokens. The researchers don't have an answer for why this is, but they suspect that the reasoning doesn't scale.
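
The point above about a short program solving the Tower puzzle for any size n can be sketched with the standard recursive solution. This is a generic textbook version, not code from the paper; the function name `hanoi` and the peg labels are chosen here for illustration.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Yield the moves (disc, from_peg, to_peg) that solve an n-disc Tower of Hanoi."""
    if n == 0:
        return
    # Move the n-1 smaller discs out of the way, onto the spare peg.
    yield from hanoi(n - 1, source, spare, target)
    # Move the largest remaining disc straight to the target peg.
    yield (n, source, target)
    # Move the n-1 smaller discs from the spare peg onto the target peg.
    yield from hanoi(n - 1, spare, target, source)


moves = list(hanoi(3))
print(len(moves))  # 2**3 - 1 = 7
for disc, src, dst in moves:
    print(f"move disc {disc} from {src} to {dst}")
```

The solution always has 2^n - 1 moves, so the 25-disc case is 33,554,431 moves of the same three-step pattern: simple, but very long.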

[–] reksas@sopuli.xyz 37 points 1 week ago (4 children)

does ANY model reason at all?

[–] 4am@lemm.ee 34 points 1 week ago (3 children)

No, and to make that work using the current structures we use for creating AI models we’d probably need all the collective computing power on earth at once.

[–] bjoern_tantau@swg-empire.de 36 points 1 week ago* (last edited 6 days ago)

[–] skisnow@lemmy.ca 26 points 1 week ago (1 children)

What's hilarious/sad is the response to this article over on reddit's "singularity" sub, in which all the top comments are people who've obviously never got all the way through a research paper in their lives all trashing Apple and claiming their researchers don't understand AI or "reasoning". It's a weird cult.

[–] vala@lemmy.world 25 points 1 week ago

No shit

[–] SplashJackson@lemmy.ca 24 points 1 week ago (1 children)

Just like me

[–] technocrit@lemmy.dbzer0.com 23 points 1 week ago* (last edited 1 week ago) (5 children)

Why would they "prove" something that's completely obvious?

The burden of proof is on the grifters who have overwhelmingly been making false claims and distorting language for decades.

[–] TheRealKuni@midwest.social 33 points 1 week ago (2 children)

Why would they "prove" something that's completely obvious?

I don’t want to be critical, but I think if you step back a bit and look at what you’re saying, you’re asking why we would bother to experiment and prove what we think we know.

That’s a perfectly normal and reasonable scientific pursuit. Yes, in a rational society the burden of proof would be on the grifters, but that’s never how it actually works. It’s always the doctors disproving the cure-all, not the snake oil salesmen failing to prove their own product.

There is value in this research, even if it fits what you already believe on the subject. I would think you would be thrilled to have your hypothesis confirmed.

[–] yeahiknow3@lemmings.world 23 points 1 week ago* (last edited 1 week ago) (1 children)

They’re just using the terminology that’s widespread in the field. In a sense, the paper’s purpose is to prove that this terminology is unsuitable.

[–] GaMEChld@lemmy.world 21 points 1 week ago (8 children)

Most humans don't reason. They just parrot shit too. The design is very human.

[–] elbarto777@lemmy.world 26 points 1 week ago (1 children)

LLMs deal with tokens. Essentially, predicting a series of bytes.

Humans do much, much, much, much, much, much, much more than that.

[–] Zexks@lemmy.world -3 points 6 days ago (4 children)

No. They don't. We just call them proteins.
