LocalLLaMA
Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Let's explore cutting-edge open-source neural network technology together.
Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped at the latest and greatest model releases! Enjoy talking about our awesome hobby.
As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive constructive way.
Rules:
Rule 1 - No harassment or personal character attacks of community members. I.e., no name-calling, no generalizing entire groups of people that make up our community, no baseless personal insults.
Rule 2 - No comparing artificial intelligence/machine learning models to cryptocurrency. I.e., no comparing the usefulness of models to that of NFTs, no claiming the resource usage required to train a model is anything close to that of maintaining a blockchain or mining for crypto, no implying it's just a fad/bubble that will leave people with nothing of value when it bursts.
Rule 3 - No comparing artificial intelligence/machine learning to simple text prediction algorithms. I.e., statements such as "LLMs are basically just simple text predictions like what your phone keyboard autocorrect uses, and they're still using the same algorithms since <over 10 years ago>".
Rule 4 - No implying that models are devoid of purpose or potential for enriching people's lives.
I feel we've run into the exact same issue as before. Now we're talking about property. But we were just talking about investment, and we've just established that those two are distinct and not the same. It's a bit confusing. And I agree that the resulting granted monopoly and rent-seeking is an intended feature, and not a contribution to society. But my previous comment was addressing the aspect of the author's investment and ROI, not the property resulting from that. And that's not arbitrary at all. The author specifically sat at his desk for 6 months. Sure, the resulting product is arbitrary when selling it for money, but that wasn't what we were talking about.
I don't think we're easily defrauded by the copyright industry. As I said, school books seem to be like 10x cheaper here. Medication with pharma IP in it is mostly cheaper here, and I have my library card for like 30€ a year?! And other than that we use the same Spotify and Netflix subscriptions for a similar price. There's no substantial difference there. I don't see myself in a less favourable position than a US citizen. We also have access to information here, good books, podcasts, journalism, we have culture, concerts... And I don't think any of that is better or cheaper or more accessible in the US. Correct me if I'm wrong...
Yeah, some photography rules are absurd. I think it's completely mental that people commit copyright infringement when they take a picture of a sculpture. Seems US Fair Use sometimes has weird quirks. We also have stupid rules for pictures in Germany.
Considering feudalism... I'd like to re-define that, since we haven't had lords and a king for quite some time now. Today's land holders on the internet are companies like Meta, Google etc. They own the platforms we use on a daily basis. They make the rules, shape the place and lease chunks to us peasants as a service. We even let them shape society. For all intents and purposes, they're the feudal lords of today. And that's kind of the reason for my rejection here and why I said early on that all these AI companies are big multi-billion dollar corporations with motivations far from benefiting society. I believe concepts like Fair Use might have been invented as a means to combat feudalism. But it looks to me like the situation is now changing, and it's more and more used to the opposite effect by the feudal lords themselves, to add to their possessions, wealth and dominance.
I'll grant you the copyright industry is a worthy enemy, since they're villains, too. The copyright business model isn't healthy or beneficial to society overall. We've established that. But I really think of feudalism and a de facto monopoly when I think of Google and Meta and OpenAI/Microsoft. And I'd really like to avoid making more concessions to my feudal lords.
Hmm. It looks like we are back to narratives again. Systematic analysis does not seem to come easy to you.
"Investment" and "rent-seeking" are concepts in economics. Like, say, "function" or "variable" are concepts in programming.
"Property" is a legal institution. It relates to "investment" a bit like a machine code instruction relates to programming. They are, sort of, the underlying facts on which higher concepts rest.
I guess you didn't get what I was trying to say. Let me put it like this:
If they wrote a story that takes place in the universe of a video game, then they need to get permission first. They need to ask whoever owns the rights to the video game, or else it is "theft".
Conversely, if the story is original, and anyone wants to make a video game in that universe, then they need the author's permission.
This remains so until 70 years after the death of the creator of the video game/story. At least, it is 70 years now. It may be made longer again at any time.
That is arbitrary, no?
Not just them, but yes. How do you think they manage that?
That seems pretty vibes-based. What do you rationally expect the outcome of your favored policies to be?
Yes. That's arbitrary. But we're conflating several very different things here. There is investment in the form of labour. And I'm pretty sure we have to agree that in general, labour needs to be compensated in a capitalist economy. Then there is copyright. And this is intellectual property, which is yet another concept. All of this goes into a book, but they're all very different things. I think IP is the most abstract one (it protects concepts) and kind of moot. I'd be more lax with IP and try to allow everyone to draw a Mickey Mouse, program a Final Fantasy game or write a new Harry Potter book. Patents are a similar thing. Though we have them for a reason.
That's why I say I'm with you with the copyright and the intellectual property. But there's also work going into a book and we're always brushing over that as if it weren't a thing.
It's many factors. Timing, aggressive acquisition strategies, ecosystem building, network effects, then ecosystem lock-in, data harvesting, dominating standards, but also providing genuinely useful services. Economy of scale, massive capital... And I probably forgot dozens of factors, some legitimate, some exploitative.
Sorry, misunderstanding. I wasn't asking what you hope to happen.
You have ideas on how copyright should work wrt AI training. Make these ideas explicit, and then try to systematically analyze what the economic effects are.
Law can be a little bit like programming. A law has certain conditions. If these conditions are met, then certain legal effects follow.
If certain conditions are met, then someone has the exclusive copyright. If this copyright is violated, then damages must be paid. And of course, there are more rules to determine if copyright was violated or how those damages should be determined.
So under what conditions does AI training violate copyright? What would the legal consequence be? Then, what would that mean for the economic system on the whole?
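To make the "conditions in, legal effects out" framing concrete, here is a minimal toy sketch in Python. It is not a statement of any real copyright statute: the `Work` and `Use` types, the field names and the three outcome strings are all made up for illustration, and the term calculation is deliberately simplified. The only figure taken from the thread is the 70 years after the creator's death.

```python
from dataclasses import dataclass
from typing import Optional

TERM_YEARS = 70  # term mentioned in the thread; everything else here is a toy assumption


@dataclass
class Work:
    is_original: bool
    author_death_year: Optional[int]  # None while the author is still alive


@dataclass
class Use:
    year: int
    has_permission: bool


def is_protected(work: Work, year: int) -> bool:
    """Conditions: an original work, still within the term."""
    if not work.is_original:
        return False
    if work.author_death_year is None:
        return True
    return year <= work.author_death_year + TERM_YEARS


def legal_effect(work: Work, use: Use) -> str:
    """If the conditions are met, a legal effect follows."""
    if not is_protected(work, use.year):
        return "no exclusive right: free to use"
    if use.has_permission:
        return "licensed use: no violation"
    return "violation: damages must be paid"


story = Work(is_original=True, author_death_year=1950)
print(legal_effect(story, Use(year=2000, has_permission=False)))
print(legal_effect(story, Use(year=2000, has_permission=True)))
print(legal_effect(story, Use(year=2030, has_permission=False)))
```

Answering the question for AI training would then mean deciding which extra conditions and effects to write into `legal_effect`, and only afterwards asking what that does to the economy as a whole.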
That's a tough question. Copyright is showing its age and barely applies in the digital world. Even before AI we had a lot of edge cases and court cases over like a decade to find out how copyright applies to a digital concept. I don't think there is an easy way to retrofit something. At least I can't come up with a good idea. And the general proposal seems to be all or nothing.
What I think doesn't work is saying every normal citizen needs to buy books and Zuckerberg gets to pirate books. In a democracy law has to apply to everyone. And his use-case doesn't matter here. I can also claim I pirated the 10TB of TV shows and movies for transformative or legitimate use. It's still piracy. And other law works the same way. If I steal chocolate in the supermarket, that's also theft no matter what I was planning to do with it. So that's out.
And then we're left with how the economy is supposed to work as of today. An AI company needs supplies to manufacture their product, they buy those supplies on the market... In this case that's going to be licensing content. Though, that's going to be hard. A billion dollar company with a service used by millions of people should pay more than a single researcher doing it for 5 people. And implementing that would be impossibly complex. One possible way would be to introduce a collecting society to handle the money and maths. But they're not ideal either.
So it's more or less down to allowing AI companies to use content with some kind of default license. They can take all the public information as they wish. Again, they cannot steal in the process. They'll buy one copy of a Terry Pratchett novel at the same price everyone needs to pay.
And to compensate for them not having to contract with the authors and buy special licenses, they need to offer transparency. Tell the authors and everyone what went into the models and if their content is amongst that. And if they scraped my personal data, I need a way to get that deleted from the dataset.
I'd also add an optional opt-out mechanism to appease the people who hate AI. They can add some machine-readable notice, or file a complaint, and their content will be discarded.
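As a rough illustration of what a machine-readable notice could look like, here is a sketch that borrows the existing robots.txt convention and checks it with Python's standard `urllib.robotparser`. This is only one possible mechanism and not necessarily the one meant above. The crawler name `ExampleAIBot`, the URL and the helper `may_use_for_training` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# A site owner's notice: the hypothetical AI crawler is excluded, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_use_for_training(robots_txt: str, crawler: str, url: str) -> bool:
    """Return True if the notice does not exclude this crawler from the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(crawler, url)


url = "https://example.org/novel/chapter-1"
print(may_use_for_training(ROBOTS_TXT, "ExampleAIBot", url))
print(may_use_for_training(ROBOTS_TXT, "SomeSearchBot", url))
```

A notice like this only works if the crawler chooses to honour it, which is why the complaint route is still needed as a fallback.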
And since just taking and not contributing back isn't healthy to society, I'd add something about "composite" works. If something like an AI model is just pieced together by other people's content, that doesn't deserve copyright in my opinion. So all generations are automatically public domain and maybe the models as well.
And we need a definition of AI and transformative. Once we get capable models with the ability to recite an entire novel word for word, that's going to run into copyright again. So yeah.
And intellectual property has to be softened. A generative AI model necessarily "contains" a lot of IP, has knowledge about it and can reproduce it. And we need to be alright with that. And in case someone wants to outlaw impersonation and celebrity deepfakes, there needs to be more than a blurry line.
But all of this is more patching copyright and we're going to run into all kinds of issues with that. I think ideally we come up with a grand idea and overhaul the entire thing so it applies to the 21st century.