this post was submitted on 19 Jan 2024

250 points (92.5% liked)

A New York Times copyright lawsuit could kill OpenAI (www.vox.com)

submitted 10 months ago by [email protected] to c/[email protected]

90 comments

A list of authors and entertainers are also suing the tech company for damages that could total in the billions.

[–] [email protected] 68 points 10 months ago (19 children)

I always say this when this comes up because I really believe it's the right solution - any generative AI built with unlicensed and/or public works should then be free for the public to use.

If they want to charge for access that's fine but they should have to go about securing legal rights first. If that's impossible, they should worry about profits some other way like maybe add-ons such as internet connected AI and so forth.

[–] [email protected] 11 points 10 months ago (1 children)

There's plenty of money to be made providing infrastructure. Lots of companies make a ton of money providing infrastructure for open source projects.

On another note, why is OpenAI even called "open"?

[–] [email protected] 4 points 10 months ago

On another note, why is OpenAI even called “open”?

It's because of the implication...

[–] [email protected] 9 points 10 months ago (1 children)

Not really how it works these days. Look at Uber and Lime/Bird scooters. They would basically just show up in a city and say, "To hell with the law, we are starting our business here." We just call it disruptive technology.

[–] [email protected] 6 points 10 months ago

Unfortunately true, and the long arm of the law, at least in the business world, isn't really that long. Would love to see some monopoly busting to scare a few of these big companies into shape.

[–] [email protected] 6 points 10 months ago

A very compelling solution! It allows a model of free use while providing an avenue for businesses to spend time developing it.

[–] [email protected] 3 points 10 months ago (2 children)

Nice idea but how do you propose they pay for the billions of dollars it costs to train and then run said model?

[–] [email protected] 5 points 10 months ago (12 children)

Then don't do it. Simple as that.

[–] [email protected] 3 points 10 months ago* (last edited 10 months ago)

Defending scamming as a business model is not a business model.

[–] [email protected] 62 points 10 months ago* (last edited 10 months ago)

I doubt it. It would likely kill any AI companies not backed by giant tech firms, though.

Microsoft has armies of lawyers and cash to pay. It would make life a lot harder, but they’d survive

[–] [email protected] 36 points 10 months ago (3 children)

If OpenAI owns a Copyright on the output of their LLMs, then I side with the NYT.

If the output is public domain--that is you or I could use it commercially without OpenAI's permission--then I side with OpenAI.

Sort of like how a spell checker works. The dictionary is Copyrighted, the spell check software is Copyrighted, but using it on your document doesn't grant the spell check vendor any Copyright over it.

I think this strikes a reasonable balance between creators' IP rights, AI companies' interest in expansion, and the public interest in having these tools at our disposal. So, in my scheme, either creators get a royalty, or the LLM company doesn't get to Copyright the outputs. I could even see different AI companies going down different paths and offering different kinds of service based on that distinction.

[–] [email protected] 12 points 10 months ago

I want people to take my code if they share their changes (gpl). Taking and not giving back is just free labor.

[–] [email protected] 5 points 10 months ago

I think it currently resides with the one doing the generation and not OpenAI itself. Officially it is a bit unclear.

Hopefully, all gens become copyleft just for the fact that AIs tend to repeat themselves. Specific faces will pop up quite often in image gen, for example.

[–] [email protected] 26 points 10 months ago* (last edited 10 months ago) (1 children)

The NYT has a market cap of about $8B. MSFT has a market cap of about $3T. MSFT could take a controlling interest in the Times for the change it finds in the couch cushions. I’m betting a good chunk of the c-suites of the interested parties have higher personal net worths than the NYT has in market cap.

I have mixed feelings about how generative models are built and used. I have mixed feelings about IP laws. I think there needs to be a distinction between academic research and for-profit applications. I don’t know how to bring the laws into alignment on all of those things.

But I do know that the interested parties who are developing generative models for commercial use, in addition to making their models available for academics and non-commercial applications, could well afford to properly compensate companies for their training data.

[–] [email protected] 20 points 10 months ago* (last edited 10 months ago) (2 children)

deleted

[–] [email protected] 7 points 10 months ago

Or Musk when he decided he didn't like what people were saying on Twitter.

[–] [email protected] 4 points 10 months ago

I completely agree. I don’t want them to buy out the NYT, and I would rather move back to the laws that prevented over-consolidation of the media. I think that Sinclair and the consolidated talk radio networks represent a very real source of danger to democracy. I think we should legally restrict the number of markets a particular broadcast company can be in, and I also believe that we can and should come up with an argument that’s the equivalent of the Fairness Doctrine that doesn’t rest on something as physical and mundane as the public airwaves.

[–] [email protected] 20 points 10 months ago

Oh no, how terrible. What ever will we do without Shenanigans Inc. 🙄

[–] [email protected] 18 points 10 months ago

YES! AI is cool I guess, but the massive AI circlejerk is so irritating though.

If OpenAI can infringe upon all the copyrighted material on the net then the internet can use everything of theirs all for free too.

[–] [email protected] 16 points 10 months ago* (last edited 10 months ago) (16 children)

This would bring up the cost of entry for making a model and nothing more. OpenAI will buy the data if they have to, and so will Google. The money will only go to the owners of the New York Times and its shareholders; none of the journalists who will be let go in the coming years will see a dime.

We must keep the barrier to entry into the AI game as low as possible or the only two players will be Microsoft and Google. And as our economy becomes increasingly AI driven, this will cement them owning it.

Pragmatism or slavery, these are the two options.

[–] [email protected] 10 points 10 months ago* (last edited 10 months ago)

[email protected] is deleting their comments and reposting the same comment to dodge replies. Link to the last thread.

[–] [email protected] 13 points 10 months ago

inshallah

[–] [email protected] 9 points 10 months ago

Oh no. Anyways.

[–] [email protected] 8 points 10 months ago (3 children)

Is there a possible way that both the NYT and OpenAI could lose?

[–] [email protected] 4 points 10 months ago

NYT loses even if they win.

While I'd love to see OpenAI forced to take a step back, AI isn't going away.

Journalism will have to adapt or it will get replaced, just like so many jobs, including my own.

[–] [email protected] 3 points 10 months ago

Not without a bunch of lawyers winning.

[–] [email protected] 5 points 10 months ago* (last edited 10 months ago) (1 children)

The problem with copyright is that everything is automatically copyrighted. The copyright logo is purely symbolic, at this point. Both sides are technically right, even though the courts have ruled that anything an AI outputs is actually in the public domain.

[–] [email protected] 3 points 10 months ago (1 children)

Works involving the use of AI are copyrightable. Also, the Copyright Office's guidance isn’t law. Their guidance reflects only the office’s interpretation based on its experience; it isn’t binding on the courts or other parties. Guidance from the office is not a substitute for legal advice, and it does not create any rights or obligations for anyone. They are the lowest rung on the ladder for deciding what law means.

[–] [email protected] 2 points 10 months ago (1 children)

I wasn't talking about the Copyright Office. I was talking about the courts.

[–] [email protected] 2 points 10 months ago* (last edited 10 months ago) (4 children)

This ruling is about something else entirely. He tried to argue that the AI itself was the author and that copyright should pass to him as he hired it.

An excerpt from your article:

In 2018, Dr. Thaler sought to register "Recent Entrance" with the U.S. Copyright Office, listing the Creativity Machine as its author. He claimed that ownership had been transferred to him under the work-for-hire doctrine, which allows the employer of the creator of a given work or the commissioner of the work to be considered its legal author. However, in 2019, the Copyright Office denied copyright registration for "Recent Entrance," ruling that the work lacked the requisite human authorship. Dr. Thaler requested a review of his application, but the Copyright Office once more refused registration, restating the requirement that a human have created the work.

[–] [email protected] 4 points 10 months ago

Don’t threaten me with a good time!

[–] [email protected] 3 points 10 months ago

This is the best summary I could come up with:

Late last year, the New York Times sued OpenAI and Microsoft, alleging that the companies are stealing its copyrighted content to train their large language models and then profiting off of it.

Meanwhile, the Senate Judiciary Subcommittee on Privacy, Technology, and Law held a hearing in which news executives implored lawmakers to force AI companies to pay publishers for using their content.

In its rebuttal, OpenAI said that regurgitation is a “rare bug” that the company is “working to drive to zero.” It also claims that the Times “intentionally manipulated prompts” to get this to happen and “cherry-picked their examples from many attempts.”

A growing list of authors and entertainers have been filing lawsuits since ChatGPT made its splashy debut in the fall of 2022, accusing these companies of copying their works in order to train their models.

Developers have sued OpenAI and Microsoft for allegedly stealing software code, while Getty Images is embroiled in a lawsuit against Stability AI, the makers of image-generating model Stable Diffusion, over its copyrighted photos.

In that 2013 decision, Judge Chin said its technology “advances the progress of the arts and sciences, while maintaining respectful consideration for the rights of authors and other creative individuals, and without adversely impacting the rights of copyright holders.” And a 2023 economics study of the effects of Google Books found that “digitization significantly boosts the demand for physical versions” and “allows independent publishers to introduce new editions for existing books, further increasing sales.” So consider that another point in favor of giving tech platforms room to innovate.

The original article contains 1,628 words, the summary contains 259 words. Saved 84%. I'm a bot and I'm open source!
