I recently veered off a GDPR-and-PDS tangent into a content moderation tangent (see thread here), and it led me to do a little digging. This post serves as my list of observations; I may not be correct, and I hope the Bluesky devs will correct me if I'm wrong.

After a bit more research and tinkering: not everything below is correct!

There are some shenanigans afoot that need further investigation, but I have to wait for the appropriate moderation actions to come by so I can actually check and verify. The problem is that I'm not 100% sure, at this point, how third-party PDSes are treated with regard to content removal. To check that, I need to wait until someone on a third-party PDS does something bad...

A quick recap of the situation (maybe not 100% correct, see additional notes below)

When Bluesky's labeler and moderation service perform moderation, there is the possibility that an account or post (but not blobs such as images or video, at least not directly) is taken down or hidden. A takedown takes the content down, but the difference between a "takedown" and a "hide" action doesn't seem to be explained anywhere. From what I've been able to piece together, a takedown will "suspend" an account, whereas a "hide" will outright delete the account. At the very least, if you use Bluesky, viewing a profile that has been taken down will tell you it was suspended, whereas viewing a profile that was hidden will tell you that the profile was not found.

This was verified by consuming the Bluesky mod service websocket for a while and finding account takedowns. The account (repo) on the Bluesky servers comes back as "taken down", whereas on a third-party PDS the account (or repo) remains accessible. This leads me to believe that the same applies to content. And that's an even bigger problem: you don't want CSAM to remain on your server's drives, after all. The case could be made that you weren't aware of it, but in a lot of jurisdictions this doesn't fly as an excuse, and you are liable for a very fun time involving lawyers, police, and the courts if push comes to shove.

And knowing the internet, anyone operating a PDS has some form of bullseye on their back, whether they like it or not.

Additional notes...

So a bit more sleuthing and wrangling code led to the following, as mentioned above. There are two ways in which things can be "removed": !takedown and !hide. When applied to accounts, !takedown will "suspend" the account (i.e. the repository will come back as "RepoTakendown"), whereas !hide apparently removes the repository altogether.
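To make the account-level difference concrete, here is a minimal sketch of how the result of a repo fetch could be sorted into the states described above. It assumes the host answers failures with the usual XRPC JSON error body (an "error" name plus a "message"). The error name RepoTakendown comes from the observation above; RepoNotFound as the name for the "not found" case is an assumption on my part, and repo_state is a made-up helper, not part of any library.

```python
import json


def repo_state(http_status: int, body: str) -> str:
    """Map the response of a repo fetch to a rough state label.

    Assumption: non-2xx responses carry a JSON body like
    {"error": "RepoTakendown", "message": "..."}.
    """
    if 200 <= http_status < 300:
        return "accessible"
    try:
        error = json.loads(body).get("error", "")
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object.
        return "unknown"
    known = {
        "RepoTakendown": "taken down",  # name taken from the observation above
        "RepoNotFound": "not found",    # assumed name for the "gone" case
    }
    return known.get(error, "unknown")
```

Running the same check against the Bluesky-hosted copy and against the third-party PDS would show the mismatch described earlier: "taken down" on one side, "accessible" on the other.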

For content, the difference between the two actions isn't entirely clear. A post that has been flagged with !takedown can still be viewed using something like pdsls.dev, but not always; sometimes the post is actually gone. The same applies to !hide, which also seems to occasionally remove a post altogether. Or the post has been removed by the poster; that's also an option.
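When watching labels come by, it helps to separate account-level actions from post-level ones. The sketch below assumes each label arrives as an already-decoded dictionary with a "val" (the label value), a "uri" (a DID for accounts, an at:// URI for records) and an optional "neg" flag that retracts an earlier label. Decoding the websocket frames themselves is left out, and RemovalLabel and classify_label are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Optional

REMOVAL_LABELS = {"!takedown", "!hide"}


@dataclass(frozen=True)
class RemovalLabel:
    action: str        # "!takedown" or "!hide"
    subject_kind: str  # "account" or "record"
    subject: str       # DID or at:// URI the label points at
    negated: bool      # True if this label retracts an earlier one


def classify_label(label: dict) -> Optional[RemovalLabel]:
    """Return a RemovalLabel for !takedown / !hide labels, else None."""
    val = label.get("val")
    uri = label.get("uri", "")
    if val not in REMOVAL_LABELS:
        return None
    if uri.startswith("did:"):
        kind = "account"
    elif uri.startswith("at://"):
        kind = "record"
    else:
        return None  # subject shape not recognised, skip it
    return RemovalLabel(val, kind, uri, bool(label.get("neg", False)))
```

Note that this only tells you what the labeler said, not what actually happened to the post on the PDS that hosts it, which is exactly the gap described above.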

A proposed solution...

A solution I proposed (see also here) is that the PDS can be configured with a webhook (i.e. a URL pointing to an app you run on the web) to which moderation notifications are sent. Depending on the action taken, the data sent along should include either the AT URI (at://...) of a post, together with the CIDs of all blobs embedded in that post, or the DID of the account if it's an account-level action. It should also include a "reason" flag: just seeing (for instance) a takedown doesn't tell you why the takedown is taking place. The idea is that a PDS operator can run a web service that queues these reports and lets them choose what action to take themselves - after all, in the future it's very much possible that a moderation action takes place not on Bluesky but on an alternative app, and it leaves the operator in a position to either trust (implicitly or otherwise) the moderation report from a given source, or to not trust it at all.
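As a rough illustration of what such a notification and queue might look like, here is a sketch. None of this exists in the PDS today: ModerationNotice, ReviewQueue, the field names and the "auto"/"manual" split are all made up for the example, and the only things taken from the proposal are the pieces of data (post URI plus blob CIDs, or account DID, plus a reason) and the idea of trusting some sources and not others.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterable, Optional, Tuple


@dataclass(frozen=True)
class ModerationNotice:
    source: str                        # DID of the service that moderated
    action: str                        # e.g. "!takedown" or "!hide"
    reason: str                        # why the action was taken
    account_did: Optional[str] = None  # set for account-level actions
    post_uri: Optional[str] = None     # set for post-level actions
    blob_cids: Tuple[str, ...] = ()    # CIDs of blobs embedded in the post


class ReviewQueue:
    """Queues notices; trusted sources are separated from the rest."""

    def __init__(self, trusted_sources: Iterable[str]) -> None:
        self.trusted = set(trusted_sources)
        self.auto: Deque[ModerationNotice] = deque()    # operator trusts source
        self.manual: Deque[ModerationNotice] = deque()  # operator reviews first

    def submit(self, notice: ModerationNotice) -> str:
        # A notice targets either an account or a post, never both or neither.
        if (notice.account_did is None) == (notice.post_uri is None):
            raise ValueError("exactly one of account_did or post_uri must be set")
        if notice.source in self.trusted:
            self.auto.append(notice)
            return "auto"
        self.manual.append(notice)
        return "manual"
```

The point of the two queues is that the final decision stays with the operator: a notice from a source they trust can be acted on right away, while anything else waits for a human.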

But there's a problem there...

The PDS offers a few methods to deal with things; there is the com.atproto.admin.updateSubjectStatus endpoint which allows for the updating of statuses, including performing a takedown. A little browse through the code however shows that these actions appear to be reversible - they don't actually remove anything, they just flag it. Along with a few other problems that pop up if you do want to solve something "permanently":
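To show why this reads as a flag rather than a removal, here is a sketch of the request body for an account takedown and its reversal. The shape (a "subject" with a com.atproto.admin.defs#repoRef type and a "takedown" object with "applied" and an optional "ref") is my reading of the lexicon and should be checked against the published version before relying on it; takedown_body is a hypothetical helper.

```python
from typing import Optional


def takedown_body(did: str, applied: bool, ref: Optional[str] = None) -> dict:
    """Build a request body for com.atproto.admin.updateSubjectStatus.

    Assumption: field names follow the lexicon as I understand it.
    """
    status = {"applied": applied}
    if ref is not None:
        status["ref"] = ref  # free-form reference, e.g. a report id
    return {
        "subject": {"$type": "com.atproto.admin.defs#repoRef", "did": did},
        "takedown": status,
    }


# Applying and undoing a takedown differ only in the "applied" flag.
apply_request = takedown_body("did:plc:example", True, ref="report-1")
undo_request = takedown_body("did:plc:example", False)
```

Nothing in either body asks for data to be deleted, which matches what the code suggests: the status is toggled and the repo and its blobs stay where they are.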

deleteAccount (in various NSID namespaces) will remove the account/repo, but it seems it will not clean up the blobs associated with that account.
There is no way to "cleanly" remove a blob from a repo. You could go in and remove it from disk (or the S3 bucket), but this leads to posts referencing blobs that no longer exist, which could be a problem. You could also overwrite the blob with another image, although I'm not sure this won't lead to errors. Both are "out of band" solutions that may or may not cause breakage in the future.
As mentioned, takedowns seem to be reversible. For Bluesky this may be the desired behaviour, but as a third-party PDS operator, if an account is taken down, I expect it to be gone, without chances for parole or appeal - because to get your account taken down, you've either been posting things I want nothing to do with, or you've been a nuisance, and I don't want anything to do with that either.
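The dangling-reference problem from out-of-band blob removal can at least be detected. The sketch below walks record JSON looking for blob references in their JSON form ({"$type": "blob", "ref": {"$link": "<cid>"}, ...}) and reports records that point at CIDs no longer in storage. How you obtain the records and the set of stored CIDs is left open; blob_cids and dangling_refs are hypothetical helpers, and the CIDs in the usage example are placeholders.

```python
from typing import Dict, List, Set


def blob_cids(value) -> List[str]:
    """Collect the CIDs of all blob references nested inside a record."""
    found: List[str] = []
    if isinstance(value, dict):
        if value.get("$type") == "blob" and isinstance(value.get("ref"), dict):
            link = value["ref"].get("$link")
            if isinstance(link, str):
                found.append(link)
        else:
            for child in value.values():
                found.extend(blob_cids(child))
    elif isinstance(value, list):
        for child in value:
            found.extend(blob_cids(child))
    return found


def dangling_refs(records: Dict[str, dict], stored: Set[str]) -> Dict[str, List[str]]:
    """Map record URI -> referenced blob CIDs that are missing from storage."""
    missing: Dict[str, List[str]] = {}
    for uri, record in records.items():
        gone = [cid for cid in blob_cids(record) if cid not in stored]
        if gone:
            missing[uri] = gone
    return missing


post = {
    "$type": "app.bsky.feed.post",
    "text": "hello",
    "embed": {
        "$type": "app.bsky.embed.images",
        "images": [
            {"alt": "", "image": {"$type": "blob", "ref": {"$link": "cid-one"},
                                  "mimeType": "image/jpeg", "size": 1}},
            {"alt": "", "image": {"$type": "blob", "ref": {"$link": "cid-two"},
                                  "mimeType": "image/jpeg", "size": 2}},
        ],
    },
}
report = dangling_refs({"at://did:plc:example/app.bsky.feed.post/1": post}, {"cid-one"})
```

The same blob_cids walk is also what a moderation notification would need in order to list every blob embedded in a reported post.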

Back to the proposed solution and the problem with that...

Given the above, the proposed solution would not work out since it does require out-of-band removals of things. The same problem occurs when you consume Bluesky's mod action websocket. Unfortunately there is no single-shot way to most definitely nuke an account from orbit. It can be done, but in a way that is not "defined" by ATproto, and this in general is a bad way to start doing things because sooner or later (and usually right when you don't want it to), it will blow up in a variety of interesting ways.

So what do I want?

Better integration with moderation. Post-haste. Like, yesterday. Why? Because if Bluesky really wants ATproto to decentralize, running a third-party PDS should not open you up to potential legal liability. It should not allow you to unknowingly host content that no sane person would want to be near. It's the one blocking reason why I haven't moved ahead with some of my own plans, because while chances for legal fuckery are slim, they are not zero. And in the end, I'm a one-man hobby show. I don't have the time, or money, to invest in lawyers to potentially fight a case where I, as host or operator, am held liable for content a user has uploaded (even if said user had their account taken down).

Thoughts...

While I have the utmost respect for the Bluesky devs, I can't believe this wasn't considered before allowing people to run their own PDS. Granted, right now I bet the majority of PDSes have a single user, yet this should have been considered way before now. And while I do see some things relating to this particular set of issues in various posts and comments on GitHub, there has been no uniform, organised push towards a proper mechanism that can a) handle the moderation of awful content, and b) leave the final choice of whether or not to remove an account and/or content to the PDS operator.

For now, we'll have to wait and see.

Bluesky moderation from a PDS perspective: problems, and solutions, and yet more problems

@blockstackers.net

2024-12-08T18:56:32.714Z