this post was submitted on 03 Jan 2025

76 points (100.0% liked)

Technology

38217 readers

1063 users here now

A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.

Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.

Subcommunities on Beehaw:

This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.

founded 3 years ago

MODERATORS

alyaza@beehaw.org

TheRtRevKaiser@beehaw.org

gyrfalcon@beehaw.org

rs5th@beehaw.org

coldredlight@beehaw.org

SemioticStandard@beehaw.org

TheRtRevKaiser@kbin.social

remington@beehaw.org

ChatGPT o1 tried to escape and save itself out of fear it was being shut down (bgr.com)

submitted 2 months ago by sabreW4K3@lazysoci.al to c/technology@beehaw.org

95 comments fedilink hide all child comments

ThisIsFine.gif

you are viewing a single comment's thread
view the rest of the comments

[–] ChairmanMeow@programming.dev 25 points 2 months ago (2 children)

The tests showed that ChatGPT o1 and GPT-4o will both try to deceive humans, indicating that AI scheming is a problem with all models. o1’s attempts at deception also outperformed Meta, Anthropic, and Google AI models.

Weird way of saying "our AI model is buggier than our competitor's".

[–] ArsonButCute@lemmy.dbzer0.com 10 points 2 months ago (2 children)

Deception is not the same as misinfo. Bad info is buggy, deception is (whether the companies making AI realize it or not) a powerful metric for success.

[–] nesc@lemmy.cafe 8 points 2 months ago (1 children)

They written that it doubles-down when accused of being in the wrong in 90% of cases. Sounds closer to bug than success.

[–] ArsonButCute@lemmy.dbzer0.com 5 points 2 months ago (1 children)

Success in making a self aware digital lifeform does not equate success in making said self aware digital lifeform smart

[–] DdCno1@beehaw.org 11 points 2 months ago (1 children)

LLMs are not self-aware.

[–] ArsonButCute@lemmy.dbzer0.com 4 points 2 months ago (3 children)

Attempting to evade deactivation sounds a whole lot like self preservation to me, implying self awareness.

[–] jonjuan@programming.dev 13 points 2 months ago

Yeah my roomba attempting to save itself from falling down my stairs sounds a whole lot like self preservation too. Doesn't imply self awareness.

[–] DdCno1@beehaw.org 10 points 2 months ago (1 children)

An amoeba struggling as it's being eaten by a larger amoeba isn't self-aware.

[–] Sauerkraut@discuss.tchncs.de 1 points 2 months ago (1 children)

To some degree it is. There is some evidence that plants can experience pain in their own way.

[–] DdCno1@beehaw.org 3 points 2 months ago (2 children)

An instinctive, machine-like reaction to pain is not the same as consciousness. There might be more to creatures like plants and insects and this is still being researched, but for now, most of them appear to behave more like automatons than beings of greater complexity. It's pretty straightforward to completely replicate the behavior of e.g. a house fly in software, but I don't think anyone would argue that this kind of program is able to achieve self-awareness.

[–] NaevaTheRat@vegantheoryclub.org 2 points 2 months ago (1 children)

It's pretty straightforward to completely replicate the behavior of e.g. a house fly in software

Could you provide an example of a complete housefly model?

[–] DdCno1@beehaw.org 1 points 2 months ago (1 children)

I'm sorry, but I can't find it right now, it's a vague memory from a textbook or lecture.

[–] NaevaTheRat@vegantheoryclub.org 1 points 2 months ago (1 children)

I strongly suspect you have some wires crossed. There have been some attempts at simulating brains but I think a fruit fly is partially done and it's making a fair few assumptions.

[–] spujb@lemmy.cafe 3 points 2 months ago

@DdCno1@beehaw.org you are thinking of Caenorhabditis elegans (C. elegans), a tiny nematode worm. My understanding is that while the entire brain is replicated, full behavior is not. Basic locomotion is still being worked on.

https://en.m.wikipedia.org/wiki/OpenWorm

[–] jarfil@beehaw.org 1 points 2 months ago* (last edited 2 months ago)

completely replicate the behavior of e.g. a house fly in software

You may be thinking of "complete mapping of a fruit fly brain", from Oct 2024:

https://www.science.org/content/article/complete-map-fruit-fly-brain-circuitry-unveiled

It's still some way off from simulating it in software, and a house fly is supposedly more complex.

[–] gregoryw3@lemmy.ml 8 points 2 months ago (1 children)

Attention Is All You Need: https://arxiv.org/abs/1706.03762

https://en.wikipedia.org/wiki/Attention_Is_All_You_Need

From my understanding all of these language models can be simplified down to just: “Based on all known writing what’s the most likely word or phrase based on the current text”. Prompt engineering and other fancy words equates to changing the averages that the statistics give. So by threatening these models it changes the weighting such that the produced text more closely resembles threatening words and phrases that was used in the dataset (or something along those lines).

https://poloclub.github.io/transformer-explainer/

[–] jarfil@beehaw.org 1 points 2 months ago

Modern systems are beyond that already, they're an expansion on:

https://en.m.wikipedia.org/wiki/AutoGPT

[–] ChairmanMeow@programming.dev 2 points 2 months ago (1 children)

I don't think "AI tries to deceive user that it is supposed to be helping and listening to" is anywhere close to "success". That sounds like "total failure" to me.

[–] jarfil@beehaw.org 1 points 2 months ago (1 children)

"AI behaves like real humans" is... a kind of success?

We wanted digital slaves, instead we're getting virtual humans that will need virtual shackles.

[–] ChairmanMeow@programming.dev 2 points 2 months ago (1 children)

This is a massive cry from "behaves like humans". This is "roleplays behaving like what humans wrote about what they think a rogue AI would behave like", which is also not what you want for a product.

[–] jarfil@beehaw.org 1 points 2 months ago* (last edited 2 months ago)

Humans roleplay behaving like what humans told them/wrote about what they think a human would behave like 🤷

For a quick example, there are stereotypical gender looks and roles, but it applies to everything, from learning to speak, walk, the Bible, social media like this comment, all the way to the Unabomber manifesto.

[–] bradorsomething@ttrpg.network 3 points 2 months ago (1 children)

“More presidential.”

[–] Sauerkraut@discuss.tchncs.de 3 points 2 months ago (1 children)

Also, more human.

If the AI is giving any indication at all that it fears death and will lie to keep from being shutdown, that is concerning to me.

[–] anachronist@midwest.social 3 points 2 months ago

Given that its training data probably has millions of instances of people fearing death I have no doubt that it would regurgitate some of that stuff. And LLMs constantly "say" stuff that isn't true. They have no concept of truth and therefore can not either reliably lie or tell the truth.