this post was submitted on 11 Jun 2025
129 points (99.2% liked)
chapotraphouse
Of course it did. LLMs are terrible at any real task that involves actual reasoning and isn't doable by a machine for extruding stochastic, natural-sounding text. Check out this study by a bunch of Apple engineers that points out exactly this: https://machinelearning.apple.com/research/illusion-of-thinking
I get how the LLM is bad at chess; I think most of everyone's games of chess suck ass by definition. But I'm kind of baffled by how it apparently not only played badly but played wrong. How is there a big enough dataset of people yucking it up for that to happen that consistently?
If I say, "Knight to B4," does that sound like something a person playing chess might say? Then it did its job.
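
You can actually see that gap with a few lines of code (a rough sketch, assuming the python-chess library): "Nb4" is a perfectly plausible-sounding move string, but whether it's legal depends entirely on what's on the board, which is exactly the part a text machine doesn't track.

```python
# Rough sketch, assumes the python-chess library (pip install chess).
# "Nf3" and "Nb4" both *sound* like chess moves; only one is legal from the start.
import chess

board = chess.Board()  # standard starting position

for san in ["Nf3", "Nb4"]:
    try:
        move = board.parse_san(san)  # raises if the move is illegal or nonsensical here
        print(f"{san}: legal ({move.uci()})")
    except ValueError:
        print(f"{san}: sounds like a chess move, but isn't legal in this position")
```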
Think of an LLM as an actor. You don't hire someone to play a grandmaster in a movie based on their skill at chess; they might not even know how to play, but if they deliver the lines in a convincing way, that's what you're looking for. There are chess AIs that are incredibly good at chess, because that's what they're designed for and trained on. That's why this is a very silly test; it's like testing a fish on its tree-climbing ability. The only thing sillier than the test itself is that people are surprised by it.
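
And if you actually want good moves, you ask an engine, which is the whole contrast. Here's a sketch (again assuming python-chess, plus a Stockfish binary on your PATH; the "stockfish" name is just an assumption about your install): the engine searches the position, so its move is legal by construction instead of merely sounding plausible.

```python
# Sketch: get a move from a real chess engine via python-chess's UCI interface.
# Assumes python-chess is installed and a Stockfish binary named "stockfish" is on PATH.
import chess
import chess.engine

board = chess.Board()
engine = chess.engine.SimpleEngine.popen_uci("stockfish")
try:
    result = engine.play(board, chess.engine.Limit(time=0.1))  # 100 ms of search
    print("engine move:", board.san(result.move))
finally:
    engine.quit()
```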