this post was submitted on 27 May 2024
1102 points (98.0% liked)
No, the truth is it's impossible even then. If the result involves randomness at its most fundamental level, then it's not reliable, whatever you do.
Sure, the AI is never going to understand what it's doing or why, but training it on better datasets certainly WILL improve the results.
Garbage in, garbage out.
You can train an LLM on the best possible set of data without a single false statement and it will still hallucinate. And there’s nothing to be done about that.
Without understanding of the context everything can be true or false.
“The acceleration due to gravity is equal to 9.81 m/s²” True or False?
An LLM basically works like this: given the previous words written and their order, it picks the most probable next word of the sentence.
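To make that concrete, here's a rough toy sketch, not how any real LLM is built. It assumes a made-up four-sentence corpus and only looks at the single previous word (a bigram model), whereas a real LLM conditions on the whole preceding context with a neural network. The names `CORPUS`, `train_bigrams`, `most_probable_next` and `generable` are invented for this example. Every training sentence is true, yet the model can still chain words into sentences nobody ever wrote.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: every sentence in it is true.
CORPUS = [
    "cats are mammals",
    "sparrows are birds",
    "birds lay eggs",
    "mammals have fur",
]

def train_bigrams(sentences):
    """For each word, count which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def most_probable_next(counts, prev):
    """Return the most frequent follower of `prev` (ties broken arbitrarily)."""
    return counts[prev].most_common(1)[0][0]

def generable(counts, prefix=("<s>",), max_len=8):
    """Yield every sentence the model gives a non-zero probability."""
    for nxt in counts[prefix[-1]]:
        if nxt == "</s>":
            yield " ".join(prefix[1:])
        elif len(prefix) < max_len:
            yield from generable(counts, prefix + (nxt,), max_len)

counts = train_bigrams(CORPUS)
print("after 'cats' comes:", most_probable_next(counts, "cats"))

# Sentences the model can produce that were never in the training data.
made_up = sorted(set(generable(counts)) - set(CORPUS))
print("not in the training data:", made_up)
```

The second list includes "cats are birds": the model only knows that "are" can follow "cats" and that "birds" can follow "are". It has no notion of whether the combination is true, which is the point made above about clean training data.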
Well yes, I've seen those examples of ChatGPT citing scientific research papers that turned out to be completely made up, but at least it seems to be a step up from straight up shitposting, which is what you get when you train it on a dataset full of shitposts.
Well, it’s definitely true that you will have a hard time getting true things from garbage. But funnily enough, the model might hallucinate true things :)