this post was submitted on 16 Jun 2023
62 points (95.6% liked)
Lemmy.World Announcements
This Community is intended for posts about the Lemmy.world server by the admins.
Follow us for server news
Outages
https://status.lemmy.world
For support with issues at Lemmy.world, go to the Lemmy.world Support community.
Support e-mail
Support requests are best sent by e-mail to [email protected].
Report contact
- DM https://lemmy.world/u/lwreport
- Email [email protected] (PGP Supported)
Donations
If you would like to make a donation to support the cost of running this platform, please do so at the following donation URLs.
If you can, please use or switch to Ko-Fi; it has the lowest fees for us.
Join the team
founded 1 year ago
Meanwhile, companies that already copy the whole web for their search engines (e.g. Microsoft and Google) still don't need to make API calls to get Reddit posts.
More generally, anything that can be seen by a not-logged-in browser can be indexed by search engines and thus ingested into any training pipeline you can imagine.
This is part of my complaint against Reddit doing this. Google and Microsoft already have the data, so Reddit is just ensuring that smaller companies and open source LLMs fail. I am also a little annoyed by the app thing, but I think it's important that we don't let tech giants monopolize this new technology.
I deleted my Reddit post history; it's not their data to sell.