this post was submitted on 17 Aug 2024
614 points (98.4% liked)

Technology

[–] wewbull@feddit.uk 5 points 1 month ago (1 children)

They don't do it because they claim that there isn't enough public domain data... But let's be honest, nobody has tried because nobody wants a machine that can't reference anything from the last 100 years.

[–] Even_Adder@lemmy.dbzer0.com 4 points 1 month ago (1 children)

You should read this letter by Katherine Klosek, the director of information policy and federal relations at the Association of Research Libraries.

Why are scholars and librarians so invested in protecting the precedent that training AI LLMs on copyright-protected works is a transformative fair use? Rachael G. Samberg, Timothy Vollmer, and Samantha Teremi (of UC Berkeley Library) recently wrote that maintaining the continued treatment of training AI models as fair use is “essential to protecting research,” including non-generative, nonprofit educational research methodologies like text and data mining (TDM). If fair use rights were overridden and licenses restricted researchers to training AI on public domain works, scholars would be limited in the scope of inquiries that can be made using AI tools. Works in the public domain are not representative of the full scope of culture, and training AI on public domain works would omit studies of contemporary history, culture, and society from the scholarly record, as Authors Alliance and LCA described in a recent petition to the US Copyright Office. Hampering researchers’ ability to interrogate modern in-copyright materials through a licensing regime would mean that research is less relevant and useful to the concerns of the day.

[–] wewbull@feddit.uk 0 points 1 month ago (1 children)

I would disagree, because I don't see the research into AI as something of value to preserve.

[–] Even_Adder@lemmy.dbzer0.com 1 points 1 month ago

This isn't about research into AI. What some people want will impact all research, criticism, analysis, and archiving. Please re-read the letter.