this post was submitted on 21 Aug 2023
641 points (95.5% liked)
Technology
I think excluding all AI creations from copyright might be one part of a good solution to all this. But you’re right that something has to be done at the point of scraping and training. Perhaps training without permission should be considered outside of fair use, and therefore a copyright violation.
This would make obtaining training data extremely expensive. That effectively makes AI research impossible for individuals, educational institutions, and small players. Only tech giants would have the resources necessary to generate or obtain training data. This is already a problem with compute costs for training very large models, and making the training data more expensive only makes the problem worse. We need more free/open AI and less corporate-controlled AI.
That problem has been solved many times over. Go check out the Google Maps API as an example. Small-scale usage is free, with a margin generous enough for startups and academics. And a special arrangement for nonprofit use can be made, by approval.