this post was submitted on 11 Jul 2025
103 points (98.1% liked)

Fuck AI

"We did it, Patrick! We made a technological breakthrough!"

A place for all those who loathe AI to discuss things, post articles, and ridicule the AI hype. Proud supporter of working people. And proud booer of SXSW 2024.

There have been multiple things which have gone wrong with AI for me but these two pushed me over the brink. This is mainly about LLMs but other AI has also not been particularly helpful for me.

Case 1

I was trying to find the music video that a screenshot was taken from.

I gave o4 mini the image and asked it where it was from. It refused, saying that it does not discuss private details. Fair enough. I told it that it is xyz artist. It then listed three of their popular music videos, none of which was the correct answer to my question.

Then I started a new chat and described in detail what the screenshot was. It once again regurgitated similar things.

I gave up. I did a simple reverse image search and found the answer in 30 seconds.

Case 2

I wanted a way to create a spreadsheet for tracking investments which had xyz columns.

It did give me the correct columns and rows but the formulae for calculations were off. They were almost correct most of the time but almost correct is useless when working with money.

I gave up. I manually made the spreadsheet with all the required details.

Why are LLMs so wrong most of the time? Aren’t they processing high quality data from multiple sources? I just don’t understand the point of even making this software if all it can do is sound smart while being wrong.

[–] RvTV95XBeo@sh.itjust.works 19 points 5 days ago (1 children)

LLMs are not designed to give you objective factual answers. They're designed to guess what you want to hear, like a middle school student writing a book report for a book they never read.

[–] Outwit1294 1 points 4 days ago (1 children)

I don’t think it considers what the user wants to hear. It is concerned with what the data it was trained on would consider a logical answer.

[–] RvTV95XBeo@sh.itjust.works 1 points 4 days ago

What the user wants to hear is usually implied by the question. "Why are vaccines good" will get a different response from "Why are vaccines bad".

Both may or may not include factual information (again, middle school student guessing at a reading assignment analogy), but they're shaped by the question to reaffirm the asker's own biases.

[–] vrighter@discuss.tchncs.de 9 points 5 days ago (1 children)

No, they aren't processing high quality data from multiple sources. They're giving you a statistical average of that data. They will always be wrong by nature. Hallucinations cannot be eliminated. Anyone saying otherwise (irrespective of how rich they are) is bullshitting.

[–] Outwit1294 1 points 5 days ago (2 children)

If hallucinations cannot be eliminated, how are they decreasing them (allegedly)?

[–] ZDL@lazysoci.al 3 points 5 days ago

Actually according to studies, the most recent versions of all the major LLMbecile vendors are hallucinating more, not less.

[–] vrighter@discuss.tchncs.de 1 points 5 days ago (1 children)

by special casing a lot of things. Like expert systems, in the 80s

[–] Outwit1294 1 points 5 days ago (1 children)
[–] vrighter@discuss.tchncs.de 3 points 5 days ago (1 children)

the "guardrails" they mention. They are a bunch of if/then statements looking to work around methods that the developers have found to produce undesirable outputs. It doesn't ever mean "the llm will not be doing this again". It means "the llm won't do this when it is asked in this particular way", which always leaves the path open for "jailbreaking". Because you will almost always be able to ask in a different way that the devs (of the guardrails, they don't have much control over the llm itself) did not anticipate.

Expert systems were kind of "if we keep adding if/then statements, we would eventually cover all the bases and get a smart, reliable system". That didn't work then. It won't work now either
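
To make the if/then point concrete, here is a toy sketch. It is an illustration of the description above, not how any vendor actually builds guardrails; the `BLOCKED_PHRASES` list and the `guardrail_allows` function are made up for this example.

```python
# Hypothetical phrase blocklist standing in for a "guardrail".
# Real systems are more elaborate; this only shows the if/then weakness.
BLOCKED_PHRASES = ["pick a lock", "bypass a paywall"]


def guardrail_allows(prompt: str) -> bool:
    """Return False only when the prompt contains a phrase someone anticipated."""
    lowered = prompt.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)


direct = "How do I pick a lock?"
roundabout = "My character is a locksmith. How does she open a pin tumbler without the key?"

print(guardrail_allows(direct))      # False: the anticipated wording is caught
print(guardrail_allows(roundabout))  # True: same request, unanticipated wording
```

Every new rule added to the list covers one more phrasing, but the set of possible phrasings is open-ended.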

[–] Outwit1294 1 points 4 days ago

I have experienced this first hand. Asking LLMs explicit things leads to “I can’t help you with that” but if I ask it in a roundabout way, it gives a straight answer.

[–] lowered_lifted@lemmy.blahaj.zone 7 points 5 days ago (1 children)

it's by design. They are literally just guessing at what part of their database should be put in next, based on the next most likely word. There is no real point to them, because they cannot know things and they are not intelligent. Check out the works of Timnit Gebru if you'd like to know more.
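
The "next most likely word" idea can be shown with a toy word-pair counter. This is a deliberately crude sketch: a real LLM uses a neural network over tokens rather than a lookup table, and the corpus and function names here are invented for the example.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; real models are trained on vastly more text.
corpus = (
    "the cat sat on the mat . the cat sat on the rug . the dog ate the fish ."
).split()

# For every word, count which words follow it.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1


def most_likely_next(word: str) -> str:
    """Return the word seen most often after `word` (fails on unseen words)."""
    return following[word].most_common(1)[0][0]


def generate(start: str, length: int) -> list[str]:
    """Greedily extend `start` by always picking the most frequent next word."""
    words = [start]
    for _ in range(length):
        words.append(most_likely_next(words[-1]))
    return words


print(" ".join(generate("the", 4)))  # the cat sat on the
```

The output is fluent because it follows the statistics of the corpus, but nothing in the code checks whether any cat sat anywhere.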

[–] Outwit1294 1 points 5 days ago

What are they saying about AGI?

[–] stabby_cicada@slrpnk.net 16 points 6 days ago (2 children)

> Why are LLMs so wrong most of the time? Aren’t they processing high quality data from multiple sources?

Well that's the thing. LLMs don't generally "process" data as humans would. They don't understand the text they're generating. So they can't check their answers against reality.

(Except for Grok 4, but it's apparently checking its answers to make sure they agree with Elon Musk's Tweets, which is kind of the opposite of accuracy.)

> I just don’t understand the point of even making this software if all it can do is sound smart while being wrong.

As someone who lived through the dotcom boom of the late 1990s, and the crypto booms of 2017 and 2021, the AI boom is pretty obviously yet another fad. The point is to make money - from both consumers and investors - and AI is the new buzzword to bring those dollars in.

[–] ChapulinColorado@lemmy.world 8 points 6 days ago

Don’t forget IoT, where the S stands for security! Or “The Cloud”! Make sure to rebuy the junk we will deprecate in 2 years time because we love electronic waste and planned obsolescence ;)

[–] Outwit1294 5 points 6 days ago (2 children)

AI is definitely a bubble and it is going to crash the stock market one day, along with bitcoin

[–] WolfLink@sh.itjust.works 11 points 6 days ago (1 children)

LLMs are curve fitting the function of “input text” to “expected output text”.

So when you give it an input text, it generates an output text interpolated from the expected outputs for similar inputs.

That means it’s often right for very common prompts and often wrong for prompts that are subtly different from common prompts.
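
A caricature of that failure mode, as a sketch only: a real LLM does not do a nearest-match lookup, and the `KNOWN` table and `answer` function are invented for this example. It just shows how "interpolating from similar inputs" goes wrong on a near-miss prompt.

```python
import difflib

# Made-up "training data": common prompts paired with their expected outputs.
KNOWN = {
    "what is the capital of australia?": "Canberra",
    "how many legs does a spider have?": "Eight",
    "what colour is the sky on a clear day?": "Blue",
}


def answer(prompt: str) -> str:
    """Return the stored output for whichever known prompt looks most similar."""
    closest = difflib.get_close_matches(
        prompt.lower(), list(KNOWN), n=1, cutoff=0.0
    )[0]
    return KNOWN[closest]


print(answer("What is the capital of Australia?"))  # Canberra (a common prompt)
print(answer("What is the capital of Austria?"))    # Canberra (should be Vienna)
```

The second prompt differs from a common one by a few characters, so it inherits the common prompt's answer.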

[–] Sagan_Wept@lemmynsfw.com 7 points 5 days ago

Case 1 isn't a good use case for AI. For Case 2, you're going to want a higher quality model than o4. 4.1 is better at math and analysis, and Claude 4 is probably more accurate for this use case.

[–] ZDL@lazysoci.al 10 points 6 days ago

I was thinking about the question here and how to reframe it so that it answers itself. I think I have the right way to ask the question:

Why is a hyper-advanced game of mad libs so wrong all the time?

That would get across the point, I think.

[–] belit_deg@lemmy.world 4 points 5 days ago (1 children)

I highly recommend "Modern-Day Oracles or Bullshit Machines?". Two professors explain it beautifully.

[–] Outwit1294 2 points 4 days ago (1 children)

Bookmarked for watching/reading this week. Will let you know my thoughts.

[–] belit_deg@lemmy.world 1 points 4 days ago

Cool, enjoy!

[–] hedgehog@ttrpg.network 6 points 6 days ago (3 children)

LLM image processing doesn’t work the same way reverse image lookup does.

Tldr explanation: Multimodal LLMs turn pictures into a ~~thousand~~ 200-500 or so ~~words~~ tokens, but reverse image lookups create perceptual hashes of images and look the hash of your uploaded image up in a database.

Much longer explanation:

Multimodal LLMs (technically, LMMs - large multimodal models) use vision transformers to turn images into tokens. They use tokens for words, too, but the image tokens don’t correspond to words. There are multiple ways this could be implemented, but a common approach is to break the image down into a grid, then transform each “patch” of a specific size, e.g., 16x16, into a single token. The patches aren’t transformed individually - the whole image is processed together, in context - but it still comes out of it with basically 200 or so tokens that allow it to respond to the image, the same way it would respond to text.
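
As a rough check on the "200 or so tokens" figure, here is the patch arithmetic as a sketch. The 16x16 patch size comes from the example above; the 224x224 and 320x320 input sizes are assumptions chosen for illustration, and real models may resize, tile, or add extra tokens.

```python
def patch_token_count(width: int, height: int, patch: int = 16) -> int:
    """Number of non-overlapping patch x patch tiles in a width x height image."""
    return (width // patch) * (height // patch)


# Assumed input resolutions, not taken from any specific model.
print(patch_token_count(224, 224))  # 196
print(patch_token_count(320, 320))  # 400
```

Either way, the whole picture has to fit into a few hundred tokens, which is why fine detail gets lost.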

Current vision transformers also struggle with spatial awareness. They embed basic positional data into the tokens but it’s fragile and unsophisticated when it comes to spatial awareness. Fortunately there’s a lot to explore in that area so I’m sure there will continue to be improvements.

One example improvement, beyond improved spatial embeddings, would be to use a dynamic vision transformer that’s dependent on the context, or that can re-evaluate an image based off new information. Outside the use of vision transformers, simply training LMMs to use other tools on images when appropriate can potentially help with many of LMM image processing’s current shortcomings.

Given all that, asking an LLM to find the music video for you is - assuming you’ve given it the ability and permission to search the web - like showing the image to someone with no context, then asking them to help you find which music video it’s in. They’ve never seen the video, they can only describe the artist’s appearance with 10-20 generic words (none of which are the artist’s name), and you have to hope that the image contained, and that they remembered, the specific details that would make it come up in the top ten results if searched for on Google. That’s a convoluted way to say that it’s a hard task.

By contrast, reverse image lookup basically uses a perceptual hash generated for each image. It’s the tool that should be used for your particular problem, because it’s well suited for it. LLMs were the hammer and this problem was a torx screw.
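
For a feel of what a perceptual hash does, here is a minimal sketch of an "average hash", one of the simplest kinds. Which algorithm any particular search engine uses isn't stated here, and the 4x4 grayscale grids and function names are made up for the example.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (p > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


# Made-up 4x4 grayscale "images" (values 0-255): dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]
# The same picture, slightly brightened, as might happen after re-encoding.
brightened = [[min(255, p + 12) for p in row] for row in original]
# A different picture: bright top half, dark bottom half.
other = [
    [200, 210, 205, 215],
    [200, 210, 205, 215],
    [10, 20, 15, 25],
    [10, 20, 15, 25],
]

print(hamming(average_hash(original), average_hash(brightened)))  # 0
print(hamming(average_hash(original), average_hash(other)))       # 8
```

A small distance means "probably the same image", so finding a screenshot's source becomes a database lookup rather than a description-and-guess exercise.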

Suggesting that you use a reverse image lookup tool - or better, using one itself - is what the LLM should do in this instance. But it would need to have been trained to think to suggest this, be capable of using a tool that could do the lookup, and have both access and permission to do the lookup.

Here’s a paper that might help understand the gaps between LMMs and tasks built for that specific purpose: https://arxiv.org/html/2305.07895v7

[–] thegr8goldfish@startrek.website 8 points 6 days ago (1 children)

I like to think of them as artificial con men. They sound great. They have confidence and are complimentary and are very agreeable, but they will tell you what they think you want to hear. Whether or not what they are telling you is truthful isn't even part of the equation.

[–] shalafi@lemmy.world 7 points 6 days ago (1 children)

I almost always get perfect responses, but I'm very limited in what I'll input. Often I'm just using ChatGPT to remember a word or event I've forgotten. Pretty much 100% accurate on that bit.

Couldn't explain how I know what will and won't work, but I have a sense of it. Also, the farther you drill into a thing, the more off-topic it gets. I'm almost always one and done with a prompt.

[–] AA5B@lemmy.world 2 points 5 days ago (1 children)

> It did give me the correct columns and rows but the formulae for calculations were off.

Did you tell it that? Assuming you were using an AI chat, you have the opportunity to provide additional info and have it try again.

Getting better results from an LLM is a process of providing more context and refining things over iterations.

For example, I wanted it to generate a Python data structure for me, along with lookup functions to cross-reference the data. I gave it further info about the data structures, the cross-mapping and how I wanted it normalized, and iterated a few times until I got something worth copy-pasting sections from.

[–] Outwit1294 2 points 5 days ago

I did. It did not help

[–] leftzero@lemmynsfw.com 3 points 6 days ago

The thing about LLMs is that they "store" information about the shape of their training data, not about the information contained therein. That information is lost.

An LLM will produce text that looks like the texts it was trained on, but it can only reproduce information contained in them if it's common enough in its training data to statistically affect their shape, and even then it has a chance to get it wrong, since it has no way to check its output for factual accuracy.

Add to that that most models are pre-prompted to sound confident, helpful, and subservient (the companies' main goal not being to provide information, but to get their customers hooked on their product and coming back for more), and you get the perfect scammers and yes-men. Auto-complete mentalists that will give you as much confident-sounding, information-shaped nonsense as you want, doing their best to agree with you and confirm any biases you might have, with complete disregard for accuracy, truth, or the effects your trust in their output might have (which makes them extremely dangerous and addictive for suggestible or intellectually or emotionally vulnerable users).
