this post was submitted on 04 Aug 2023

72 points (93.9% liked)


Death by LLM: Stack Overflow's decline, and its plan to survive, shows the future of free online data in an AI world (www.businessinsider.com)

submitted 2 years ago by floofloof@lemmy.ca to c/technology@lemmy.ml


[–] eager_eagle@lemmy.world 26 points 2 years ago* (last edited 2 years ago) (2 children)

I like the experience of using Copilot and GPT much better than browsing SO, but this is what worries me in the long term:

This issue goes beyond the survival of Stack Overflow. All AI models need a steady flow of quality human data to train on. Without that, they'll be left to rely on machine-generated content, and researchers have found that this leads to worse performance. There's an ominous name for this: model collapse.

Without this incredible knowledge sharing and curated feedback, in an environment that constantly changes with new libraries, languages, and best practices, these LLMs are doomed. I think solving this might be Stack Overflow's way out.

[–] r00ty@kbin.life 6 points 2 years ago (2 children)

Yes, and businesses thinking they can drop their developers for ChatGPT-like tech in the future should (they won't, but they should) consider this. AI goes to pot very quickly without human input.

[–] lemmyvore@feddit.nl 5 points 2 years ago* (last edited 2 years ago)

It's not even on the map. Most of the businesses that think they can replace anybody with LLMs are thinking about subscribing to an LLM that's been trained and maintained by someone else, which of course involves giving that someone the upper hand and letting them dictate terms.

Anybody who tried making their own model knows it's tough, grueling work.

So these businesses take the easy way out: they give that someone their data (breaking privacy rules and regulations in the process) and use the data that comes out of the LLM with no regard whatsoever for where and how it's been sourced, or what legal implications that might have for themselves.

Add the fact that the LLM owner usually makes you sign a contract that gives basically no guarantees, and you have the recipe for a very fine mess.

I still can't wrap my head around how, for example, any software company can let or even goad its programmers into using Copilot in good faith, with no idea where the code is coming from or what its copyright status is. And that's leaving aside the fact that Microsoft is currently being sued over this exact problem.

[–] EatMyDick@lemmy.world -5 points 2 years ago

Guys, stop this nonsense. That's not how it works.

They hire fewer new developers. There will be fewer people doing the work. Idiots who don't learn to use the tech will be left behind. This is already happening.

[–] dbilitated@aussie.zone 3 points 2 years ago (2 children)

If we use embeddings and the language documentation, I wonder how much it can work out going forward?

[–] bionicjoey@lemmy.ca 2 points 2 years ago

Nothing, because language models don't understand the text they read.

[–] eager_eagle@lemmy.world 1 points 2 years ago* (last edited 2 years ago)

From what we see today with LLMs that are given a larger context (e.g. internal documentation or knowledge bases), we can say that it'd be as good as a decent developer who reads said documentation and is able to apply that knowledge to a specific use case.

But Stack Overflow answers often target things that don't come up in the docs, that are outdated, or somewhat case-dependent and/or opinionated. Answers that might even lead to changes in documentation. This kind of insight will be hampered over time without a way of continuously sharing such knowledge.

[–] itchy_lizard@feddit.it 18 points 2 years ago* (last edited 2 years ago) (3 children)

God the narrative of Business Insider is gross.

The only thing making SO decline is that they have a CEO. And that CEO is trying to "compete".

Just keep being a great platform for Q&A and stop chasing profits. People prefer SO because the answers are trustworthy. LLMs will always bullshit you and never be better than a platform free of AI crap.

[–] Klame@lemmy.ml 10 points 2 years ago

Also, LLMs are trained on SO data. It remains a staple for coding; LLMs just reinforced that.

[–] floofloof@lemmy.ca 3 points 2 years ago* (last edited 2 years ago)

The decline has accelerated since the release of ChatGPT, which suggests there may be a connection, especially given ChatGPT's ability to answer many coding questions.

Stack Overflow posts, 2018-23:

[Chart: Stack Overflow's decline in posts accelerates in 2023]

ChatGPT traffic, 2022-23:

[Chart: ChatGPT's rise in traffic in 2023]

[–] mnemonicmonkeys@sh.itjust.works 2 points 2 years ago

I agree. That being said, there are some majorly bad answers on Stack Overflow. 9 times out of 10 I get wrong answers, and one time I was looking for a solution in Arduino and someone answered in JavaScript for some reason.

[–] recycledbits@discuss.tchncs.de 9 points 2 years ago

SO's attempts at bolting some kind of AI into their site have been a great source of entertainment:

https://meta.stackoverflow.com/questions/425162/we-are-seeking-functional-feedback-for-the-formatting-assistant