
Stanford researchers find Mastodon has a massive child abuse material problem (www.theverge.com)

submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]


Not a good look for Mastodon - what can be done to automate the removal of CSAM?


[-] [email protected] 170 points 1 year ago* (last edited 1 year ago)

https://stacks.stanford.edu/file/druid:vb515nd6874/20230724-fediverse-csam-report.pdf

I'd suggest that anyone who cares about the issue take the time to read the actual report, not just drama-oriented news articles about it.

[-] [email protected] 61 points 1 year ago

So if I'm understanding right, based on their recommendations this will all be addressed as more moderation and QOL tools are introduced as we move further down the development roadmap?


[-] [email protected] 50 points 1 year ago

If I can try to summarize the main findings:

1. Computer-generated (e.g., Stable Diffusion) child porn is not criminalized in Japan, so many Japanese Mastodon servers don't remove it.
2. Porn involving real children is removed, but not immediately, because it depends on instance admins catching it, and they have other things to do. Also, when an account is banned, the Mastodon server software does not send out a "delete" for all of its posted material (which would signal other instances to delete it).

Problem #2 can hopefully be improved with better tooling. I don't know what you do about problem #1, though.

[-] [email protected] 27 points 1 year ago

One option would be to decide that the underlying point of removing real CSAM is to avoid victimizing real children; and that computer-generated images are no more relevant to this goal than Harry/Draco slash fiction is.


[-] [email protected] 11 points 1 year ago

Such a signal exists in the ActivityPub protocol, so I wonder why it's not being used.
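For reference, here is a minimal sketch of what that signal could look like, assuming the standard ActivityStreams `Delete` activity wrapping a `Tombstone` object. The helper names and the `example.social` URLs are hypothetical, and this is not taken from Mastodon's code; a real server would also have to sign and deliver each activity to the inboxes of remote instances.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
AS_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def build_delete_activity(actor_id: str, object_id: str) -> dict:
    """Build a Delete activity asking remote servers to drop one object."""
    return {
        "@context": AS_CONTEXT,
        # Hypothetical id scheme for illustration; servers choose their own.
        "id": f"{object_id}#delete",
        "type": "Delete",
        "actor": actor_id,
        "to": [AS_PUBLIC],
        # A Tombstone marks the object as intentionally removed.
        "object": {"id": object_id, "type": "Tombstone"},
    }


def deletes_for_banned_account(actor_id: str, post_ids: list) -> list:
    """One Delete per post, so every federated copy gets a removal request."""
    return [build_delete_activity(actor_id, post_id) for post_id in post_ids]


if __name__ == "__main__":
    activities = deletes_for_banned_account(
        "https://example.social/users/banned",
        ["https://example.social/users/banned/statuses/1"],
    )
    print(json.dumps(activities[0], indent=2))
```

Whether a receiving instance actually honors the request is up to that instance, so this only covers the sending side.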


[-] [email protected] 29 points 1 year ago

4.1 Illustrated and Computer-Generated CSAM

Stopped reading.

Child abuse laws "exclude anime" for the same reason animal cruelty laws "exclude lettuce." Drawings are not children.

Drawings are not real.

Half the goddamn point of saying CSAM instead of CP is to make clear that Bart Simpson doesn't count. Bart Simpson is not real. It is fundamentally impossible to violate Bart Simpson's rights, because he doesn't fucking exist. There is nothing to protect him from. He cannot be harmed. He is imaginary.

This cannot be a controversial statement. Anyone who can't distinguish fiction from real life has brain problems.

You can't rape someone in MS Paint. Songs about murder don't leave a body. If you write about robbing Fort Knox, the gold is still there. We're not about to arrest Mads Mikkelsen for eating people. It did not happen. It was not real.

If you still want to get mad at people for jerking off to the wrong fantasies, that is an entirely different problem from photographs of child rape.


[-] [email protected] 146 points 1 year ago

Mastodon is a piece of software. I don't see anyone saying "phpBB" or "WordPress" has a massive child abuse material problem.

Has anyone in history ever said "Not a good look for phpBB"? No. Why? Because it would make no sense whatsoever.

I feel kind of at a loss for words because of how obvious it should be. It's like saying "paper is being used for illegal material. Not a good look for paper."

What is the solution to someone hosting illegal material on an nginx server? You report it to the authorities. You want to automate it? Go ahead and crawl the web for illegal material and generate automated reports. Though you'll probably be the first to end up in prison.

[-] [email protected] 34 points 1 year ago

I get what you're saying, but due to the federated nature, CSAM can easily spread to many instances without their admins noticing. Having even one piece of CSAM on your server is a huge risk for the server owner.

[-] [email protected] 30 points 1 year ago

I don't see what a server admin can do about it other than defederate the instant they get reports. Otherwise how can they possibly know?


[-] [email protected] 78 points 1 year ago

According to corporate news everything outside of the corporate internet is pedophiles.

[-] [email protected] 30 points 1 year ago

Well, terrorists became boring, and they still want the loony wing of the GOP's clicks, so best to back off on Nazis and pro-Russians, leaving pedophiles as the safest bet.


[-] [email protected] 58 points 1 year ago

Hasn’t Twitter had the same problem for years?

https://www.iwf.org.uk/news-media/news/iwf-publishes-platform-specific-data-for-child-sexual-abuse-imagery/

[-] [email protected] 47 points 1 year ago

These articles are written by idiots serving the whims of a corporate stooge, trying to smear anything other than corporate services, and it isn't even thinly veiled. Look at who this all comes from.

[-] [email protected] 8 points 1 year ago

The article written by WaPo and regurgitated by The Verge is crap, but the study from Stanford is solid. However, it's nowhere near as doom and gloom as the articles, and suggests plenty of ways to improve things. Primarily they suggest better tools for moderation.

[-] [email protected] 7 points 1 year ago

better tools for moderation

Where have I heard that before?


[-] [email protected] 6 points 1 year ago

It's weird how this headline shows up only when other headlines start covering how popular Mastodon is now.

Coincidence? Sure smells like it. God, I love astroturfing in the morning.

[-] [email protected] 41 points 1 year ago* (last edited 1 year ago)

Direct link to the (short) report this article refers to:

https://stacks.stanford.edu/file/druid:vb515nd6874/20230724-fediverse-csam-report.pdf

https://purl.stanford.edu/vb515nd6874

After reading it, I’m still unsure what all they consider to be CSAM and how much of each category they found. Here are what they count as CSAM categories as far as I can tell. No idea how much the categories overlap, and therefore no idea how many beyond the 112 PhotoDNA images are of actual children.

112 instances of known CSAM of actual children (identified by PhotoDNA).
713 times assumed CSAM, based on hashtags.
1,217 text posts talking about stuff related to grooming/trading. Includes no actual CSAM or CSAM trading/selling on Mastodon, but some links to other sites?
Drawn and Computer-Generated images. (No quantity given, possibly not counted? Part of the 713 posts above?)
Self-Generated CSAM. (Example is someone literally selling pics of their dick for Robux.) (No quantity given here either.)

Personally, I’m not sure what the take-away is supposed to be from this. It’s impossible to moderate all the user-generated content quickly. This is not a Fediverse issue. The same is true for Mastodon, Twitter, Reddit and all the other big content-generating sites. It’s a hard problem to solve. Known CSAM being deleted within hours is already pretty good, imho.

Meta-discussion especially is hard to police. Based on the report, it seems that most CP-material by mass is traded using other services (chat rooms).

For me, there’s a huge difference between actual children being directly exploited and virtual depictions of fictional children. Personally, I consider it the same as any other fetish-images which would be illegal with actual humans (guro/vore/bestiality/rape etc etc).


[-] [email protected] 39 points 1 year ago

Seems odd that they mention Mastodon as a Twitter alternative in this article, but do not make any mention of the fact that Twitter is also rife with these problems, more so as they lose employees and therefore moderation capabilities. These problems have been around on Twitter for far longer, and not nearly enough has been done.

[-] [email protected] 11 points 1 year ago* (last edited 1 year ago)

The actual report is probably better to read.

It points out that you upload to one server, and that server then sends the image to thousands of others. How do those thousands of others scan for this? In theory, using the PhotoDNA tool that large companies use, but then you have to send every image to PhotoDNA thousands of times, once for each server (because how do you trust another server telling you it's fine?).

The report provides recommendations on how servers can use signatures and public keys to trust scan results from PhotoDNA, so images can be federated with a level of trust. It also suggests large players entering the market (Meta, WordPress, etc.) should collaborate to build these tools that all servers can use.

Basically the original report points out the ease of finding CSAM on Mastodon, and addresses the challenges unique to federation, including proposing solutions. It doesn't claim centralised servers have it solved; it just addresses additional challenges federation has.


[-] [email protected] 38 points 1 year ago* (last edited 1 year ago)

“We got more photoDNA hits in a two-day period than we’ve probably had in the entire history of our organization of doing any kind of social media analysis, and it’s not even close,”

How do you have "probably" and "it's not even close" in the same sentence?

Here's the thing, and what I've been saying for a long time about The Fediverse:

I don't care what platform you have, if it is sufficiently popular, you're GOING to have CSAM. You're going to have alt-right assholes. You're going to have transphobia, you're going to have racism and every other kind of discrimination.

People point fingers at Meta for "allowing" this but there's no amount of money that can reasonably moderate 3 b-b-billion users. Meta, and probably every other platform that's not Twitter or False social, does what they can about this.

Masto and Fedi admins need to be cognizant of the number of users on their instances and need to have a sufficient number of moderators to manage those users. If they don't have them, they need to close registrations.

But ultimately the Fediverse can also create safe-havens for these sorts of things. Making it easy to set up a discriminatory network that has no outside moderation. This is the downside of free speech.

[-] [email protected] 8 points 1 year ago

Heck, Truth Social uses Mastodon, IIRC.

Ultimately, it's software. Even if my home instance does a good job of enforcing its CoC, and every instance it federates with does as well, someone else can spin up their own instance, load up on whatever, and I'll never know or even be aware if it's never federated with my instance.


[-] [email protected] 36 points 1 year ago

This is one of the reasons I'm hesitant to start my own instance - the moderation load expands exponentially as you scale, and without some sort of automated tool to keep CSAM content from being posted in the first place, I can only see the problem increasing. I'm curious to see if anyone knows of lemmy or mastodon moderation tools that could help here.

That being said, it's worth noting that the same Stanford research team reviewed Twitter and found the same dynamic in play, so this isn't a problem unique to Mastodon. The ugly thing is that Twitter has (or had) a team to deal with this, and yet:

“The investigation discovered problems with Twitter's CSAM detection mechanisms and we reported this issue to NCMEC in April, but the problem continued,” says the team. “Having no remaining Trust and Safety contacts at Twitter, we approached a third-party intermediary to arrange a briefing. Twitter was informed of the problem, and the issue appears to have been resolved as of May 20.”

Research such as this is about to become far harder—or at any rate far more expensive—following Elon Musk’s decision to start charging $42,000 per month for its previously free API. The Stanford Internet Observatory, indeed, has recently been forced to stop using the enterprise-level of the tool; the free version is said to provide read-only access, and there are concerns that researchers will be forced to delete data that was previously collected under agreement.

So going forward, such comparisons will be impossible because Twitter has locked down its API. So yes, the Fediverse has a problem, the same one Twitter has, but Twitter is actively ignoring it while reducing transparency into future moderation.

[-] [email protected] 18 points 1 year ago

If you run your instance behind Cloudflare, you can enable the CSAM scanning tool, which can automatically block known CSAM and report it to the authorities if it is uploaded to your server. This should reduce your risk as the instance operator.

https://developers.cloudflare.com/cache/reference/csam-scanning/


[-] [email protected] 9 points 1 year ago* (last edited 1 year ago)

I think the common sense solution is creating instances for physically local communities (thus keeping the moderation overhead to a minimum) and being very judicious about which instances you federate your instance with.

That being said, it's only a matter of time before moderation tools are created for streamlining the process.

[-] [email protected] 9 points 1 year ago

My instance is for members of a certain group, had to email the owner a picture of your card to get in. More instances should exist like that. General instances are great but it's nice knowing all the people on my local are in this group too.


[-] [email protected] 35 points 1 year ago* (last edited 1 year ago)

Nothing you can do except go after server owners like usual. It has nothing to do with the fedi. Mastodon has nothing to do with it either, because anyone can pop up their own alternative server. This is one of many protocols they have used or will use to distribute this stuff.

This just in: criminals are using the TCP protocol to distribute CP!!! What can the internet do to stop this? Oh yeah, go after server owners and groups like usual.

[-] [email protected] 11 points 1 year ago* (last edited 1 year ago)

Things are a bit complicated in the fediverse. Sure, your instance might not host any pedo community, but if a user on your instance subscribes to or interacts with those communities, the CSAM might get federated into your instance without you noticing. There are tools to help you combat this, but as an instance owner you can't just assume it's not your problem if some other instance hosts pedo stuff.

[-] [email protected] 6 points 1 year ago

That is definitely alarming, and a downside of the fedi, but it seems like a necessary evil. Unfortunately, admins and mods of small communities in the fedi will be the ones exposed to this. There are better methods of handling this, though. There are shared block lists out there that already block undesirable stuff like that, so they at least reduce how much disgusting stuff mods, who are just regular unpaid people, have to see. Also, obviously those instances should be reported to the police, FBI, or whatever the heck.


[-] [email protected] 17 points 1 year ago

I'm not actually going to read all that, but I'm going to take a few guesses that I'm quite sure are going to be correct.

First, I don't think Mastodon has a "massive child abuse material" problem at all. I think it has, at best, a "racy Japanese style cartoon drawing" problem or, at worst, an "AI generated smut meant to look underage" problem. I'm also quite sure there are monsters operating in the shadows, dogwhistling and hashtagging to each other to find like minded people to set up private exchanges (or instances) for actual CSAM. This is no different than any other platform on the Internet, Mastodon or not. This is no different than the golden age of IRC. This is no different from Tor. This is no different than the USENET and BBS days. People use computers for nefarious shit.

All that having been said, I'm equally sure that this "research" claims that some algorithm has found "actual child porn" on Mastodon that has been verified by some "trusted third part(y|ies)" that may or may not be named. I'm also sure this "research" spends an inordinate amount of time pointing out the "shortcomings" of Mastodon (i.e. no built-in "features" that would allow corporations/governments to conduct what is essentially dragnet surveillance on traffic) and how this has to change "for the safety of the children."

How right was I?

[-] [email protected] 16 points 1 year ago

The content in question is unfortunately something that has become very common in recent months: CSAM (child sexual abuse material), generally AI-generated.

AI is now apparently generating entire children, abusing them, and uploading video of it.

Or, they are counting "CSAM-like" images as CSAM.

[-] [email protected] 13 points 1 year ago

Of course they're counting "CSAM-like" in the stats, otherwise they wouldn't have any stats at all. In any case, they don't really care about child abuse at all. They care about a platform existing that they haven't been able to wrap their slimy tentacles around yet.

[-] [email protected] 9 points 1 year ago

Halfway there. The PDF lists drawn 2D/3D, AI/ML generated 2D, and real-life CSAM. It does highlight the actual problem of young platforms with immature moderation tools not being able to deal with the sudden influx of objectionable content.


[-] [email protected] 17 points 1 year ago

I'm always suspicious when someone argues for content filters with "protection of children" as the main argument...


[-] [email protected] 14 points 1 year ago

So what I'm reading is they didn't actually look at any images; they found hashtags, and undisclosed hashtags at that. So basically we've no idea what they think they found. For all we know, "cartoon" might've been one of the tags.

[-] [email protected] 12 points 1 year ago

I bet there's more CP hosted by Bing.

[-] [email protected] 10 points 1 year ago

Don't bother with the bullshit clickbait article. Honestly, don't give them the views.

The underlying study is good though, and worth reading.

https://purl.stanford.edu/vb515nd6874

[-] [email protected] 9 points 1 year ago* (last edited 1 year ago)

Is this Blahaj.zone admin "child abuse material" or actual child abuse material?


[-] [email protected] 8 points 1 year ago

This seems like a very normal thing with all social media. Now if the server isn't banning and removing the content within a reasonable amount of time then we have major issues.

Seems like if you talk about Mastodon but not Twitter or Facebook in the same post, it makes it feel like one has a bigger problem than the others. This article seems half-baked to get clicks.

[-] [email protected] 7 points 1 year ago

I know that people like to dump on Cloudflare, but it's incredibly easy to enable a built-in CSAM scanner with Cloudflare.

On that note, I'd like to see built-in moderation tools using something like PDQ and TMK+PDQF and a shared hashtable of CSAM and other material that may be outlawed or desirable to filter out in different regions (e.g. terrorist content, Nazi content in Germany, etc.)
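To make the shared-hashtable idea concrete, here is a rough sketch of the matching step only. It assumes each image has already been hashed by a real PDQ implementation into a 256-bit value written as 64 hex characters. The function names are hypothetical, and the default distance threshold of 31 bits is an assumption that an operator would tune, not a fixed rule.

```python
def hamming_distance(hash_a_hex: str, hash_b_hex: str) -> int:
    """Count the bits that differ between two hex-encoded hashes."""
    return bin(int(hash_a_hex, 16) ^ int(hash_b_hex, 16)).count("1")


def matches_blocklist(candidate_hex: str, blocklist: list, max_distance: int = 31) -> bool:
    """True if the candidate is within max_distance bits of any listed hash.

    max_distance is an assumed, tunable threshold: lower values mean fewer
    false positives but more misses on re-encoded or resized copies.
    """
    return any(
        hamming_distance(candidate_hex, known_hex) <= max_distance
        for known_hex in blocklist
    )


if __name__ == "__main__":
    shared_list = ["f" * 64]          # placeholder hash, not real data
    near_copy = "f" * 63 + "e"        # differs from the listed hash by one bit
    print(matches_blocklist(near_copy, shared_list))
```

A linear scan like this is fine for a small list; a large shared list would want an index built for nearest-neighbour lookups instead.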

[-] [email protected] 7 points 1 year ago

I don't want much, I just want deletion to be propagated reliably across the fediverse. If someone gets banned for CSAM and their content is purged, I want those actions propagated across all federated instances. I can't even delete my own comments reliably here on Lemmy, since many instances don't seem to get the deletion requests.


this post was submitted on 24 Jul 2023

195 points (79.5% liked)
