1

Intelligence explosion arguments don’t require Platonism. They just require intelligence to exist in the normal fuzzy way that all concepts exist.

1
submitted 11 months ago* (last edited 11 months ago) by sisyphean@programming.dev to c/auai@programming.dev

At OpenAI, protecting user data is fundamental to our mission. We do not train our models on inputs and outputs through our API.

1
1

We’re rolling out custom instructions to give you more control over how ChatGPT responds. Set your preferences, and ChatGPT will keep them in mind for all future conversations.

@AutoTLDR

1

GPT-3.5 and GPT-4 are the two most widely used large language model (LLM) services. However, when and how these models are updated over time is opaque. Here, we evaluate the March 2023 and June 2023 versions of GPT-3.5 and GPT-4 on four diverse tasks: 1) solving math problems, 2) answering sensitive/dangerous questions, 3) generating code and 4) visual reasoning. We find that the performance and behavior of both GPT-3.5 and GPT-4 can vary greatly over time. For example, GPT-4 (March 2023) was very good at identifying prime numbers (accuracy 97.6%) but GPT-4 (June 2023) was very poor on these same questions (accuracy 2.4%). Interestingly GPT-3.5 (June 2023) was much better than GPT-3.5 (March 2023) in this task. GPT-4 was less willing to answer sensitive questions in June than in March, and both GPT-4 and GPT-3.5 had more formatting mistakes in code generation in June than in March. Overall, our findings shows that the behavior of the “same” LLM service can change substantially in a relatively short amount of time, highlighting the need for continuous monitoring of LLM quality.

2
Llama 2 - Meta AI (ai.meta.com)

Introducing Llama 2 - The next generation of our open source large language model. Llama 2 is available for free for research and commercial use.

This release includes model weights and starting code for pretrained and fine-tuned Llama language models — ranging from 7B to 70B parameters.

@AutoTLDR

1

16 Mar, 2023

Kagi Search is pleased to announce the introduction of three AI features into our product offering.

We’d like to discuss how we see AI’s role in search, what are the challenges and our AI integration philosophy. Finally, we will be going over the features we are launching today.

@AutoTLDR

1
submitted 11 months ago* (last edited 11 months ago) by sisyphean@programming.dev to c/auai@programming.dev

This is a game that tests your ability to predict ("forecast") how well GPT-4 will perform at various types of questions. (In caase you've been living under a rock these last few months, GPT-4 is a state-of-the-art "AI" language model that can solve all kinds of tasks.)

Many people speak very confidently about what capabilities large language models do and do not have (and sometimes even could or could never have). I get the impression that most people who make such claims don't even know what current models can do. So: put yourself to the test.

1

Increasingly powerful AI systems are being released at an increasingly rapid pace. This week saw the debut of Claude 2, likely the second most capable AI system available to the public. The week before, Open AI released Code Interpreter, the most sophisticated mode of AI yet available. The week before that, some AIs got the ability to see images.

And yet not a single AI lab seems to have provided any user documentation. Instead, the only user guides out there appear to be Twitter influencer threads. Documentation-by-rumor is a weird choice for organizations claiming to be concerned about proper use of their technologies, but here we are.

@AutoTLDR

0

An AI-first notebook, grounded in your own documents, designed to help you gain insights faster.

@AutoTLDR

1
Claude 2 (www.anthropic.com)

We are pleased to announce Claude 2, our new model. Claude 2 has improved performance, longer responses, and can be accessed via API as well as a new public-facing beta website, claude.ai. We have heard from our users that Claude is easy to converse with, clearly explains its thinking, is less likely to produce harmful outputs, and has a longer memory. We have made improvements from our previous models on coding, math, and reasoning. For example, our latest model scored 76.5% on the multiple choice section of the Bar exam, up from 73.0% with Claude 1.3. When compared to college students applying to graduate school, Claude 2 scores above the 90th percentile on the GRE reading and writing exams, and similarly to the median applicant on quantitative reasoning.

@AutoTLDR

71

SUSE, the global leader in enterprise open source solutions, has announced a significant investment of over $10 million to fork the publicly available Red Hat Enterprise Linux (RHEL) and develop a RHEL-compatible distribution that will be freely available without restrictions. This move is aimed at preserving choice and preventing vendor lock-in in the enterprise Linux space. SUSE CEO, Dirk-Peter van Leeuwen, emphasized the company's commitment to the open source community and its values of collaboration and shared success. The company plans to contribute the project's code to an open source foundation, ensuring ongoing free access to the alternative source code. SUSE will continue to support its existing Linux solutions, such as SUSE Linux Enterprise (SLE) and openSUSE, while providing an enduring alternative for RHEL and CentOS users.

[-] sisyphean@programming.dev 17 points 1 year ago

Lol that’s like saying there’s too much porn on /r/gonewild

[-] sisyphean@programming.dev 6 points 1 year ago

Lemmy does have karma, it is stored in the DB, and the API returns it. It just isn’t displayed on the UI.

[-] sisyphean@programming.dev 7 points 1 year ago* (last edited 1 year ago)

Here people actually react to what I post and write. And they react to the best possible interpretation of what I wrote, not the worst. And even if we disagree, we can still have a nice conversation.

Does anyone have a good theory about why the threadiverse is so much friendlier? Is it only because it's smaller? Is it because of the kind of people a new platform like this attracts? Because there is no karma? Maybe something else?

[-] sisyphean@programming.dev 5 points 1 year ago

Did I miss something? Or is this still about Beehaw?

[-] sisyphean@programming.dev 7 points 1 year ago

They got gregnant

[-] sisyphean@programming.dev 3 points 1 year ago

Yeah, the situation seems pretty clear

[-] sisyphean@programming.dev 3 points 1 year ago

Can you tell us more about what they are like?

[-] sisyphean@programming.dev 4 points 1 year ago* (last edited 1 year ago)

Unfortunately the locally hosted models I've seen so far are way behind GPT-3.5. I would love to use one (though the compute costs might get pretty expensive), but the only realistic way to implement it currently is via the OpenAI API.

EDIT: there is also a 100 summaries / day limit I built into the bot to prevent becoming homeless because of a bot

[-] sisyphean@programming.dev 5 points 1 year ago

This is an excellent idea, and I'm not sure why people downvoted you. The bot library I used doesn't support requesting the user profile, but I'm sure it can be fetched directly from the API. I will look into implementing it!

[-] sisyphean@programming.dev 5 points 1 year ago* (last edited 1 year ago)

It is definitely possible, at least for videos that have a transcript. There are tools to download the transcript which can be fed into an LLM to be summarized.

I tried it here with excellent results: https://programming.dev/post/158037 - see the post description!

See also the conversation: https://chat.openai.com/share/b7d6ac4f-0756-4944-802e-7c63fbd7493f

I used GPT-4 for this post, which is miles ahead of GPT-3.5, but it would be prohibitively expensive (for me) to use it for a publicly available bot. I also asked it to generate a longer summary with subheadings instead of a TLDR.

The real question is if it is legal to programmatically download video transcripts this way. But theoretically it is entire possible, even easy.

[-] sisyphean@programming.dev 9 points 1 year ago

Trust me, the shit show is glorious. I even instinctively upvoted a couple of medieval memes but quickly realized what I was doing and closed the tab.

view more: next ›

sisyphean

joined 1 year ago
MODERATOR OF