this post was submitted on 19 Oct 2024
103 points (88.1% liked)

[–] prex@aussie.zone 1 points 1 month ago (1 children)

OTOH people are better at filtering out, or at least recognising, gibberish than LLMs. At least for now.

You are right about the fediverse being used for training content though.

I'm curious about the levels of bot posting compared to xitter etc. A low rate here would make it even more attractive to prevent model collapse.

[–] FaceDeer@fedia.io 1 points 1 month ago (1 children)

Well, the "at least for now" part is my point - if people start using "gibberish" to communicate or to hide their communication, that provides training material for LLMs to let them figure out how to use it too.

LLMs learn how to communicate based on existing examples of communication. As long as humans are communicating with each other somehow then LLMs will be able to train how to do that too. They have the same communication capabilities that we do at this point, so there's not really any way we can make a secret clubhouse that they can't figure out how to infiltrate.

Personally, I think there are two main routes we can go to deal with this. Either we can simply accept that there's no way to be 100% sure we're talking to a human any more and evaluate the value of our conversation based on the content of the words spoken rather than the composition of the entity generating them, or we could come up with some kind of "proof of personhood" system to allow people to label the text they write as coming from them.

The latter is extremely hard to do, of course, both from a technical and cultural perspective. And such a system would likely still allow someone's "person token" to be sneakily used by AI, either by voluntarily delegating it (I could very well be retyping all of this out of a ChatGPT window) or through hackery.

So I'm inclined toward the former. If I'm chatting with someone and I'm having a good time doing it, and then later I find out it was a bot, why should that change how much fun I had?

[–] prex@aussie.zone 1 points 1 month ago (1 children)

My point is that if we turn up our gibberish dial now then at least our llms will be learning the wrong thing & we have some control.

There is still a lot of understanding that we do automatically that an llm will never do. I still reckon I can spot gibberish better than an llm & I would like to keep it that way.

Or we just give up. As you can see I have mostly given up.

[–] FaceDeer@fedia.io 1 points 1 month ago (1 children)

> My point is that if we turn up our gibberish dial now then at least our llms will be learning the wrong thing & we have some control.

We'd be covering ourselves in poop to prevent people from sitting next to us on the train. Sure, people will avoid sitting next to us, but in the meantime we'll be covered in poop.

And then other people will learn the trick, cover themselves in poop too, and now everyone's poopy and the trick stops working.

> There is still a lot of understanding that we do automatically that an llm will never do.

Are you willing to bet the convenience of comprehensible online discourse on that? "Automatically understanding stuff" is basically the one job of LLMs.

LLMs model language, and coming up with some kind of "gibberish" filter is simply inventing a new language. If there's semantic meaning in it the LLMs will figure it out just like any other language, and if there isn't semantic meaning then we've lost the ability to communicate entirely. I see no upside.

[–] prex@aussie.zone 1 points 1 month ago (1 children)

Sigh

I think the point of gibberish is that it is not language.

That's why imagination and creativity are required - no?

I feel like I'm talking to an llm right now. Too many words.

[–] FaceDeer@fedia.io 1 points 1 month ago

If it's not communicating anything, what's the point?