this post was submitted on 23 Jul 2023
45 points (100.0% liked)

Technology

19 readers
2 users here now

This magazine is dedicated to discussions on the latest developments, trends, and innovations in the world of technology. Whether you are a tech enthusiast, a developer, or simply curious about the latest gadgets and software, this is the place for you. Here you can share your knowledge, ask questions, and engage in discussions on topics such as artificial intelligence, robotics, cloud computing, cybersecurity, and more. From the impact of technology on society to the ethical considerations of new technologies, this category covers a wide range of topics related to technology. Join the conversation and let's explore the ever-evolving world of technology together!

founded 1 year ago
 

While looking into artificial intelligence "behavior," researchers affirmed that yes, OpenAI's GPT-4 appeared to be getting dumber.

you are viewing a single comment's thread
view the rest of the comments
[–] mal099@kbin.social 3 points 1 year ago

Damn, you're right. The study has not been peer reviewed yet according to the article, and in my opinion, it really shows. For anyone who doesn't want to actually read the study:

They took the set of questions from a different study (which is fine). The original study had a set of 500 randomly chosen prime numbers and asked ChatGPT if they were prime, and to support its reasoning. They did this to see if in the cases where ChatGPT got the question wrong, ChatGPT would try to support its wrong answer with more faulty reasoning - a dataset with only prime numbers is perfectly fine for this initial question.

The study in the article appears to be trying to answer two questions - is there significant drift in the answers ChatGPT gives, and is ChatGPT getting better or worse at answering questions. The dataset is perfectly fine for answering the first question, but completely inadequate for answering the second, since an AI that simply thinks all numbers are prime would be judged as having perfect accuracy! Some good peer review would never let that kind of thing slide.