this post was submitted on 29 Nov 2024
84 points (96.7% liked)

Technology



A group of Canadian news and media companies filed a lawsuit Friday against OpenAI, alleging that the ChatGPT maker has infringed their copyrights and unjustly enriched itself at their expense.

The companies behind the lawsuit include the Toronto Star, the Canadian Broadcasting Corporation, the Globe and Mail, and others who seek to win monetary damages and ban OpenAI from making further use of their work.

The news companies said that OpenAI has used content scraped from their websites to train the large language models that power ChatGPT — content that is “the product of immense time, effort, and cost on behalf of the News Media Companies and their journalists, editors, and staff.”

top 3 comments
[–] [email protected] 8 points 3 weeks ago* (last edited 3 weeks ago)

Good luck to them! It'll be interesting to see how they prove scraping. Like, do you find something unique to your website & then prompt the model to give you just that? Or do you use the citation/reference features that link to your websites?

Knowing the slimeballs at OpenAI I'd wager they'd have covered their tracks.

EDIT: To be clear, I'm not suggesting they deleted evidence, but that they "laundered" the data via a public training dataset like EleutherAI's "The Pile".

[–] [email protected] 1 points 3 weeks ago

The Star, the Globe, and CBC. Cuts deep.

[–] [email protected] 1 points 3 weeks ago

Ew, they used Postmedia to train their models? Yuck.