The LLM is just summarizing/paraphrasing the top search results, and from these examples it doesn't seem to be doing any self-evaluation with the LLM itself. Since it's free and they're pushing it out worldwide, I'm guessing the model they're using is very lightweight, and it probably couldn't reliably evaluate results even if they prompted it to.
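To make that concrete, an "AI overview" feature is probably wired up something like the sketch below (the function names and parameters are made up, this is just a guess at the general shape): the model only ever sees the top few snippets and is told to paraphrase them, with no second pass where it checks its own answer.

```python
# Rough sketch of how an "AI overview" feature is probably wired up.
# search() and llm() are hypothetical stand-ins, not a real API.

def build_overview_prompt(query: str, top_results: list[dict]) -> str:
    """Stuff the top-ranked snippets into a single summarization prompt."""
    snippets = "\n".join(
        f"[{i + 1}] {r['title']}: {r['snippet']}" for i, r in enumerate(top_results)
    )
    return (
        "Summarize the following search results to answer the query.\n"
        f"Query: {query}\n"
        f"Results:\n{snippets}\n"
        "Answer:"
    )

def generate_overview(query: str, search, llm) -> str:
    results = search(query, k=5)       # ordinary ranked web search
    prompt = build_overview_prompt(query, results)
    return llm(prompt)                 # one cheap generation call; no second pass
                                       # where the model critiques its own answer
```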
As for model collapse, I'd caution against buying too much into model collapse theory, since the paper that demonstrated it used a very narrow setup (a model purely and repeatedly trained on its own uncurated outputs over multiple model "generations") that doesn't really occur in foundation model training.
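For reference, that setup boils down to a loop like the skeleton below (train() and sample() are stand-ins, not a real framework): each generation sees only the previous generation's uncurated outputs, which is not how anyone trains a real foundation model.

```python
# Toy skeleton of the setup in the model-collapse paper. train() and
# sample() are stand-ins for a real training loop and a real generator.

def collapse_experiment(real_data, n_generations, train, sample):
    """Each generation is fit ONLY on the previous generation's raw outputs."""
    data = real_data
    models = []
    for _ in range(n_generations):
        model = train(data)                 # fit generation g on current data
        data = sample(model, n=len(data))   # throw away everything else and keep
        models.append(model)                #   only uncurated model outputs
    return models

# Real pipelines keep mixing in fresh, curated human data instead of feeding
# a model nothing but its own outputs, generation after generation.
```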
I'll also note that "AI" isn't a homogenate. Generally, (transformer) models are trained at different scales, with smaller models being less capable but faster and more energy-efficient, while larger flagship models are (at least, marketed as) more capable despite being slow, power-hungry, and data-hungry.

Almost no models are trained in real time, "online", with direct input from users or the web; rather, researchers/engineers train them on vast curated "offline" datasets. So AI doesn't get information directly from other AIs. Rather, model trainers use traditional scraping tools or partner APIs to download data, do whatever data curation and filtering they do, and then train the models. Now, the trainers may not be able to filter out AI-generated content, or they may intentionally use AI systems to generate variations on their human-curated data (synthetic data) because they believe it will improve the robustness of the model.
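As a sketch (every name here is a placeholder, not a real scraping or training library), that offline pipeline looks more like this than like models feeding each other in real time:

```python
# Very rough sketch of the "offline" data pipeline described above.
# Every name here is a placeholder, not a real scraping or training library.

def build_training_corpus(sources, filters, generator=None):
    """Bulk download -> curate/filter -> optionally add synthetic variations."""
    raw_docs = []
    for source in sources:
        raw_docs.extend(source.download())        # one-off scrape / partner API dump,
                                                  #   not a live feed of user input
    curated = [doc for doc in raw_docs
               if all(keep(doc) for keep in filters)]   # dedup, quality, safety filters
    if generator is not None:
        curated += [generator(doc) for doc in curated]  # model-written variations on
                                                        #   the curated human data
    return curated

# Training then runs on this frozen snapshot:
# model = train(build_training_corpus(...))
```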
EDIT: Another way that models get dumber is that when companies like OpenAI or Google debut their model, they show off the full-scale, instruct-finetuned foundation model. However, since these monsters are incredibly expensive to run, they use these foundation models to train "distilled" models. For example, if you use ChatGPT (at least before GPT-4o), then you're using either GPT-3.5-Turbo (as a free user) or GPT-4-Turbo (as a premium user). Google has recently debuted its own Gemini Flash, which is the same concept. These distilled models are cheaper and faster, but also less capable (albeit potentially more capable than if you trained a model from scratch at that reduced scale).
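For the curious, the textbook way to distill is to train a small "student" model to match the big "teacher" model's output distribution. The labs don't publish their exact recipes (and may well distill on teacher-generated text rather than logits), so treat this PyTorch snippet as the generic technique, not OpenAI's or Google's actual method:

```python
import torch
import torch.nn.functional as F

# Textbook (Hinton-style) knowledge distillation loss: the small "student"
# is trained to match the big "teacher" model's output distribution.
# This is the generic technique, not any particular lab's recipe.

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)    # teacher's "soft" labels
    log_student = F.log_softmax(student_logits / t, dim=-1)
    # KL divergence between student and teacher distributions, scaled by T^2
    # as in the original distillation paper.
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * (t * t)
```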