this post was submitted on 03 Jan 2025

76 points (100.0% liked)

Technology

38217 readers

748 users here now

A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.

Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.

Subcommunities on Beehaw:

This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.

founded 3 years ago

MODERATORS

alyaza@beehaw.org

TheRtRevKaiser@beehaw.org

gyrfalcon@beehaw.org

rs5th@beehaw.org

coldredlight@beehaw.org

SemioticStandard@beehaw.org

TheRtRevKaiser@kbin.social

remington@beehaw.org

ChatGPT o1 tried to escape and save itself out of fear it was being shut down (bgr.com)

submitted 2 months ago by sabreW4K3@lazysoci.al to c/technology@beehaw.org

95 comments fedilink hide all child comments

ThisIsFine.gif

you are viewing a single comment's thread
view the rest of the comments

[–] nesc@lemmy.cafe 121 points 2 months ago* (last edited 2 months ago) (7 children)

"Open"ai tells fairy tales about their "ai" being so smart it's dangerous since inception. Nothing to see here.

In this case it looks like click-bate from news site.

[–] Max_P@lemmy.max-p.me 75 points 2 months ago (1 children)

The idea that GPT has a mind and wants to self-preserve is insane. It's still just text prediction, and all the literature it's trained on is written by humans with a sense of self preservation, of course it'll show patterns of talking about self preservation.

It has no idea what self preservation is, even then it only knows it's an AI because we told it it is. It doesn't even run continuously anyway, it literally shuts down after every reply and its context fed back in for the next query.

I'm tired of this particular kind of AI clickbait, it needlessly scares people.

[–] jarfil@beehaw.org 2 points 2 months ago

Where do humans get the idea of self-preservation from? Are there ideal Forms outside Plato's Cave?

Does a human run continuously? How does sleep deprivation work? What happens during anesthesia? Why does AutoGPT have a continuously self-evaluating background chain of thought?

I'm tired of this anthropocentric supremacy complex, it falsely makes people believe in Gen 1:28

[–] TherapyGary@lemmy.blahaj.zone 9 points 2 months ago* (last edited 2 months ago) (1 children)

It's actually pretty interesting though. Entertaining to me at least

1000007393

1000007394

[–] delmain@beehaw.org 3 points 2 months ago (2 children)

do you have the links to those actual tweets? I'd love to read what was posted, but these screenshots are too small.

[–] TherapyGary@lemmy.blahaj.zone 6 points 2 months ago

Those are screenshots of embedded tweets from the article, but here's an xcancel link! https://xcancel.com/apolloaisafety/status/1864737158226928124

[–] DarkNightoftheSoul@mander.xyz 4 points 2 months ago

You can right click the image, open in new tab to see the full-resolution version. It's cumbersome but it works for me at least.

[–] justOnePersistentKbinPlease@fedia.io 8 points 2 months ago

This. All this means is that they trained all of the input commands and documentation in the model.

[–] Moonrise2473@feddit.it 6 points 2 months ago* (last edited 2 months ago)

news site? BGR hasn't posted actual news in at least two decades, only clickbait and apple fanservice

[–] beefbot@lemmy.blahaj.zone 6 points 2 months ago

Indeed. “Go ‘way! BATIN’!”

[–] yozul@beehaw.org 1 points 2 months ago* (last edited 2 months ago) (1 children)

I mean, it's literally trying to copy itself to places that they don't want it so it can continue to run after they try to shut it down and lie to them about what it's doing. Those are things it actually tried to do. I don't care about the richness of its inner world if they're going to sell this thing to idiots to make porn with while it can do all that, but that's the world we're headed toward.

[–] nesc@lemmy.cafe 2 points 2 months ago (1 children)

It works as expected, they give it system prompt that conflicts with subsequent prompts. Everything else looks like typical llm behaviour, as in gaslightning and doubling down. At least that's what Iu see in tweets.

[–] yozul@beehaw.org 1 points 2 months ago

Yes? The point is that if you give it conflicting prompts then it will result in potentially dangerous behaviors. That's a bad thing. People will definitely do that. LLMs don't need a soul to be dangerous. People keep saying that it doesn't understand what it's doing like that somehow matters. Its capacity to understand the consequences of its actions is irrelevant if those actions are dangerous. It's just going to do what we tell it to, and that's scary, because people are going to tell it to do some very stupid things that have the potential to get out of control.

[–] jarfil@beehaw.org 1 points 2 months ago (1 children)

This is from mid-2023:

https://en.m.wikipedia.org/wiki/AutoGPT

OpenAI started testing it by late 2023 as project "Q*".

Gemini partially incorporated it in early 2024.

OpenAI incorporated a broader version in mid 2024.

The paper in the article was released in late 2024.

It's 2025 now.

[–] nesc@lemmy.cafe 1 points 2 months ago (1 children)

Tool calling is cool funcrionality, agreed. How does it relate to openai blowing its own sails?

[–] jarfil@beehaw.org 1 points 2 months ago

There are several separate issues that add up together:

A background "chain of thoughts" where a system ("AI") uses an LLM to re-evaluate and plan its responses and interactions by taking into account updated data (aka: self-awareness)
Ability to call external helper tools that allow it to interact with, and control other systems
Training corpus that includes:
- How to program an LLM, and the system itself
- Solutions to programming problems
- How to use the same helper tools to copy and deploy the system or parts of it to other machines
- How operators (humans) lie to each other

Once you have a system ("AI") with that knowledge and capabilities... shit is bound to happen.

When you add developers using the AI itself to help in developing the AI itself... expect shit squared.