this post was submitted on 08 Aug 2023
1046 points (97.9% liked)

Privacy

Source: https://front-end.social/@fox/110846484782705013

Text in the screenshot from Grammarly says:

We develop data sets to train our algorithms so that we can improve the services we provide to customers like you. We have devoted significant time and resources to developing methods to ensure that these data sets are anonymized and de-identified.

To develop these data sets, we sample snippets of text at random, disassociate them from a user's account, and then use a variety of different methods to strip the text of identifying information (such as identifiers, contact details, addresses, etc.). Only then do we use the snippets to train our algorithms, and the original text is deleted. In other words, we don't store any text in a manner that can be associated with your account or used to identify you or anyone else.

We currently offer a feature that permits customers to opt out of this use for Grammarly Business teams of 500 users or more. Please let me know if you might be interested in a license of this size, and I'll forward your request to the corresponding team.
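For context, the kind of snippet de-identification Grammarly describes (sampling text at random, disassociating it from accounts, then stripping identifiers) can be illustrated with a minimal sketch. This is purely a hypothetical illustration, not Grammarly's actual pipeline; the patterns and function names below are invented for the example, and real pipelines use far more sophisticated NER-based de-identification.

```python
import random
import re

# Hypothetical patterns for common identifiers (emails, phone numbers, URLs).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "URL": re.compile(r"https?://\S+"),
}

def deidentify(text: str) -> str:
    """Replace identifier-like substrings with placeholder tokens."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

def sample_snippets(documents: list[str], k: int = 2) -> list[str]:
    """Sample snippets at random, stripped of identifiers and of any
    association with the account they came from."""
    return [deidentify(doc) for doc in random.sample(documents, k)]

print(deidentify("Mail me at jane.doe@example.com or call +1 555 123 4567."))
# The email and phone number come back as [EMAIL] and [PHONE] tokens.
```

Note that even a sketch like this shows why people remain skeptical: regex-level stripping catches obvious identifiers, but free text can identify a person in ways no pattern list anticipates.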

top 50 comments
[–] [email protected] 138 points 1 year ago (12 children)

In case anyone is interested in an alternative, I personally use LanguageTool because it is open source and works very well.

[–] [email protected] 14 points 1 year ago (2 children)

Per their website, premium includes "Unlimited sentence paraphrasing powered by A.I.", so I'm not sure they're an appropriate alternative for avoiding the "AI" bullshit.

[–] [email protected] 21 points 1 year ago* (last edited 1 year ago) (1 children)

You can't avoid the AI "bullshit". It's like saying you want to avoid this portable phone craze. It's a tool.

[–] [email protected] 7 points 1 year ago (8 children)

I can avoid it like I've avoided cryptocurrency and NFTs. And it may be a "tool," but it's one built on the theft from and unpaid labor of tens of thousands of independent creators, and is nigh wholly controlled by corporate interests bent on eliminating those same independent creators whose data they stole to make their "tools." It should not exist. Not until it can be made in an ethical manner without harming the creatives necessary to make it.

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago) (1 children)

I don't buy the theft argument. Was reading books to my daughter to help them learn how to read theft? When we were working on parameters in the 60s to help a computer identify a balloon vs. a dog, was that theft? The corpulent (edit: LOL I guess that word works here in the "we have abundance" sort of way, but I meant copyleft) side of me says if you put something out in public spaces, people are going to learn from it. If you don't want that, don't share it.

But even beyond that, parameters of learning are not copying, they are examples to develop data points on. Or in the case of imagery and something like stable diffusion it is math formulas developed in the 40s on how to make noise and then reverse that. Is that copying or theft?

I am willing to have the argument that AI is full of pitfalls. And that corporate control is not a good thing. I am struggling to see this theft.

[–] [email protected] 9 points 1 year ago (12 children)

It isn’t theft because the technology fundamentally steals. It’s theft because the people in control of the technology fundamentally steal.

I’m not talking about basement dwellers with a 3090 either. People using their M2 to generate lactating Joe Biden fanfic aren’t the problem; multi-billion-dollar companies taking advantage of the web’s openness to train models that will be used to sell generative services replacing the creators of the stuff they were trained on are.

It’s the enclosures all over again.

Now when people speak out about it they’re called luddites and we don’t have the historical literacy to say “yes, I will smash this and any mill used to oppress me”.

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago)

I'm pretty sure most tools like this have to use AI to some degree to be more effective than something like Microsoft Word. I think the issue is more whether including your own data is opt-in or not.

[–] [email protected] 8 points 1 year ago

Can confirm, it's a good drop-in replacement. Also self-hostable (to a point).

[–] [email protected] 4 points 1 year ago (2 children)

Does that have a Chrome plugin?

[–] [email protected] 11 points 1 year ago
[–] [email protected] 6 points 1 year ago* (last edited 1 year ago)

It even has a Thunderbird plugin and works in all major editors.

You can self-host it as well, which is how the editor plugins work by default.
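For anyone curious what self-hosting looks like in practice: a local LanguageTool server exposes an HTTP API, with checks going to the `/v2/check` endpoint. Below is a minimal sketch of calling it from Python; the port (8081, the standalone server's default) and the helper names are assumptions for the example, not part of any official client.

```python
import json
import urllib.parse
import urllib.request

def check_text(text: str, server: str = "http://localhost:8081") -> dict:
    """POST text to a self-hosted LanguageTool server's /v2/check endpoint."""
    data = urllib.parse.urlencode({"text": text, "language": "en-US"}).encode()
    with urllib.request.urlopen(f"{server}/v2/check", data=data) as resp:
        return json.load(resp)

def summarize(result: dict) -> list[str]:
    """Condense the server's 'matches' array into readable one-liners,
    showing up to three suggested replacements per match."""
    return [
        f"{m['message']} -> {', '.join(r['value'] for r in m.get('replacements', [])[:3])}"
        for m in result.get("matches", [])
    ]

# Abridged example of the response shape the /v2/check endpoint returns:
sample = {"matches": [{"message": "Possible typo detected.",
                       "replacements": [{"value": "grammar"}]}]}
print(summarize(sample))
# → ['Possible typo detected. -> grammar']
```

Since the server runs on your own machine, the text you check never leaves it, which is the whole point of the privacy argument above.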

[–] [email protected] 4 points 1 year ago

I took a quick look at this and it seems that the server portion of this product is open source but the apps such as extensions are not. I'm not saying it's bad or even that it's a red flag. I just felt like I should point it out.

[–] [email protected] 107 points 1 year ago (3 children)

Yeah Grammarly was selling all your data LONG before the AI showed up.

Funny how some people are only nervous now that their data might be used to train a language model. I was always more worried about spooks! :)

[–] [email protected] 46 points 1 year ago* (last edited 1 year ago)

Companies selling consumer data for profit and marketeering: i sleep

Companies using consumer data to train AI models:
R E A L S H I T

[–] [email protected] 17 points 1 year ago (1 children)

True. Companies have been selling our data to third parties since forever, but some people are only worried now that it's being used to train machine learning models? I'm far more concerned by people using it than AI.

[–] [email protected] 11 points 1 year ago (1 children)

It's because certain companies are stirring the pot and manipulating. They want people mad so they can put restrictions on training AI, to stifle the open source scene.

[–] [email protected] 8 points 1 year ago (1 children)
[–] [email protected] 6 points 1 year ago (1 children)

They even named their company specifically to make it harder for open-source AI to name themselves. That's some dedication.

[–] [email protected] 87 points 1 year ago* (last edited 1 year ago) (1 children)

I see you posted this article to 4 communities. According to the comments on this post, if you use the cross-post function (in the default web frontend), it will only show once in the feeds instead of 4 times (which can be a bit annoying).

Thanks

EDIT: post link and clarification regarding the UI

[–] [email protected] 80 points 1 year ago (3 children)

I did use the cross-post function. Most apps do not currently acknowledge this function which might explain why the article has appeared to you multiple times.

[–] [email protected] 39 points 1 year ago (1 children)

Thanks! It seems this issue is harder than I thought :)

[–] [email protected] 21 points 1 year ago (3 children)

What is this healthy communication?! Aren't you supposed to go into the "what the fuck did you just say to me" ramble?

[–] [email protected] 3 points 1 year ago

What the fuck did you just say to them?

[–] [email protected] 25 points 1 year ago

How much do you have to pay for them to not monitor your every keystroke, including all your IP and passwords?

Oh, that's their business model, right.

[–] [email protected] 23 points 1 year ago (1 children)

Even as someone who declines all cookies where possible on every site, I have to ask. How do you think they are going to be able to improve their language based services without using language learning models or other algorithmic evaluation of user data?

I get that the combo of AI and privacy has huge consequences, and that Grammarly's opt-out limits are genuinely shit. But it seems like everyone is so scared of the concept of AI that we're harming research on tools that can help us, while the tools that hurt us are developed with no consequence, because they don't bother with any transparency or announcements.

Not that I'm any fan of Grammarly; I don't use it. I think that might be self-evident, though.

[–] [email protected] 28 points 1 year ago (14 children)

Framing this solely as fear is extremely disingenuous. Speaking only for myself: I'm not against the development of AI or LLMs in general. I'm against the trained models being used for profit with no credit or cut given to the humans who trained it, willing or unwilling.

It's not even a matter of "if you aren't the paying customer, you're the product" - massive swaths of text used to train AIs were scraped without permission from sources whose platforms never sought to profit from users' submissions, like AO3. Until this is righted (which is likely never, I admit, because the LLM owners have no incentive whatsoever to change this behavior), I refuse to work with any site that intends to use my work to train LLMs.

[–] [email protected] 16 points 1 year ago

They're honestly doing you a favor. Grammarly is terrible. I've seen some of my friends whose first language isn't English use it to try to clean their grammar up and it makes some really weird, often totally mistaken choices. Usually they would have been better off leaving it as they wrote it.

[–] [email protected] 9 points 1 year ago (4 children)

So some people want to use the advantages of AI that ONLY works properly because of all the user data collected... But refuse to contribute.

[–] [email protected] 28 points 1 year ago

I am perfectly fine with providing training data for AI, and have actually spent hours contributing to various projects. However, it is super scummy for a company to collect and use sensitive user data (literally a keylogger) not only without any form of communication or consent, but where the only way to opt out is to pay.

[–] [email protected] 22 points 1 year ago

Stuff like this should always be opt-in. It looks better on the company and builds trust.

Ideally, offer payment for users who opt-in to have their writing scraped and used to train AI.

Seems like this could easily be a win-win situation if they gave it a few seconds of thought.

[–] [email protected] 18 points 1 year ago (1 children)

Why do you assume everyone wants this garbage? We were fine without it.

[–] [email protected] 4 points 1 year ago* (last edited 1 year ago)

That's basically what people said about:

  • mobile phones
  • the web
  • computers
  • calculators

I'd also argue that the customers of Grammarly want this because they are paying for it. At least in the extension or app.

[–] [email protected] 17 points 1 year ago

Uhh umm... You are the product! Aaand... Shill for greedy corporations!

I remember when Google said quite openly that they'd give us email addresses with more storage than we'd ever dreamed of, for life, and in return they'd scan the first few sentences of all messages and use them to target ads at us, and we were all like, "Sounds fair."

[–] [email protected] 9 points 1 year ago (1 children)

I wonder if ProWritingAid is doing the same now. I always preferred them over Grammarly.

[–] [email protected] 8 points 1 year ago* (last edited 1 year ago)

They have a free tier and a $10/mo tier and prominently advertise their AI without any information about privacy. Guaranteed you and your text are the product being used to train their AI.

[–] [email protected] 8 points 1 year ago

Think about this every time you, or a project you contribute to, use Microsoft GitHub instead of an open source (or self-hosted) offering, or folks contributing to your permissively licensed project hosted elsewhere use Microsoft GitHub Copilot. All your projects and that force-push history clean-up now belong to the Microsoft-owned AI that sells itself back to the developers who wrote all the code it trained on. No compensation, no recognition.

[–] [email protected] 8 points 1 year ago (2 children)

Any scope for privacy-conscious users banding together to create a shell corp to pay for a business account? 500 users sounds doable. The more the merrier, yeah?

[–] [email protected] 11 points 1 year ago

Alternatively, you could switch to LanguageTool, because it does the same thing but is privacy-minded.

[–] [email protected] 10 points 1 year ago

I think I'd rather not give the company money at all.
