Bluesky, and what Bluesky is not.

@alexia.bsky.cyrneko.eu



Over the past 2-4 days, as many users have joined Bluesky, I've seen many people go mad about it, doubt the protocol it uses and its design, or spread FUD based on half-truths they've heard from others.

As someone who runs their own PDS, has looked into the protocol and its federation model, and is working on an AP In/Outbox "attachment" for ATproto Personal Data Servers, I find this especially frustrating: a lot of the information being spread is either outdated by multiple months or based on half-truths that got interpreted out of others' vague blogposts.

ATproto's federation model (and what it isn't)

First, let's get on some common ground here. ActivityPub, the ActivityStreams-based social media protocol that powers things like Mastodon, PeerTube, Pixelfed or Misskey, has a relatively simple model where user profiles are federated directly between servers. Federation starts explicitly, e.g. when one instance contacts another because someone searched for it. From that point on, federation is implicit.

ATproto takes a more modular and generalised approach to user data storage that is not just intended for "global" microblogging (or whatever other specialised AP software you run, e.g. PeerTube for videos).

ATproto is built to have the following qualities (and some more):

For the microblogging usecase, this means that there are a few core components:

  • Personal Data Servers
  • Relays ("Big Graph Server")
  • Frontend/API/Caching/The actual end product

Personal Data Servers store user data of multiple people at once, and can define defaults like moderation services, relays in use and other parameters depending on their needs.

The default configuration (as per https://github.com/bluesky-social/pds) features Bluesky's moderation.bsky.app as the moderation service, and bsky.app as the default frontend/AppView. Both of these are configurable.

Any PDS can request to be indexed/crawled by any given relay. The default for this is bsky.network - Bluesky's "official" relay.

A Relay aggregates data from any PDS that requests it into a firehose, a stream of data that can be subscribed to over WebSocket.
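As a rough sketch of what "subscribing to the firehose" means in practice, the snippet below only builds the WebSocket URL for a relay's event stream. It assumes the stream is exposed at the `com.atproto.sync.subscribeRepos` XRPC endpoint and that the optional `cursor` parameter resumes from a sequence number; the helper name `firehose_url` is made up for this example. Actually connecting needs a WebSocket client library and a decoder for the binary frames, neither of which is shown here.

```python
from typing import Optional
from urllib.parse import urlencode


def firehose_url(relay_host: str = "bsky.network", cursor: Optional[int] = None) -> str:
    """Build the WebSocket URL for a relay's repo event stream.

    relay_host: hostname of the relay (bsky.network is the default relay
                mentioned in this post).
    cursor:     optional sequence number to resume from (assumption: the
                endpoint accepts it as a query parameter).
    """
    base = f"wss://{relay_host}/xrpc/com.atproto.sync.subscribeRepos"
    if cursor is None:
        return base
    return f"{base}?{urlencode({'cursor': cursor})}"


if __name__ == "__main__":
    print(firehose_url())
    print(firehose_url("relay.example", cursor=12345))  # relay.example is a placeholder
```

Any relay that implements the same endpoint can be swapped in by changing the hostname, which is the whole point of relays being replaceable.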

In the case of Bluesky, the clients are looking for data stored in the PDS under the app.bsky.* namespace, but there are also other namespaces, e.g. for linkat.blue (blue.linkat.*), whtwnd.com (com.whtwnd.*) or other ATproto-based software.
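To make the namespace idea concrete, here is a minimal sketch of how a client could decide which records it cares about. The function names are hypothetical, and the specific collection names under `com.whtwnd` and `blue.linkat` are illustrative examples rather than a complete or authoritative list.

```python
from typing import Dict, Iterable, List


def in_namespace(collection: str, namespace: str) -> bool:
    """Return True if a collection such as 'app.bsky.feed.post' falls under
    a namespace such as 'app.bsky' (i.e. matches 'app.bsky.*')."""
    return collection.startswith(namespace + ".")


def group_by_namespace(collections: Iterable[str], namespaces: Iterable[str]) -> Dict[str, List[str]]:
    """Sort collection names into the namespaces a given app understands.
    Collections matching none of the namespaces are simply ignored."""
    groups: Dict[str, List[str]] = {ns: [] for ns in namespaces}
    for collection in collections:
        for ns in groups:
            if in_namespace(collection, ns):
                groups[ns].append(collection)
    return groups


if __name__ == "__main__":
    # Illustrative collection names only.
    repo_collections = [
        "app.bsky.feed.post",
        "app.bsky.actor.profile",
        "com.whtwnd.blog.entry",
        "blue.linkat.board",
    ]
    print(group_by_namespace(repo_collections, ["app.bsky", "com.whtwnd"]))
```

A microblogging client would only look at the `app.bsky` group, while a blogging frontend would only look at `com.whtwnd`; both read from the same repository.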

In this sense, user data like posts or profiles is not federated directly between data servers. Instead, that data is loaded from the remote server on demand, either directly or indirectly (through a relay or feed).

There are some usecases where relays are entirely unneeded though, like having an ATproto-based blog (see: https://haileyok.com), where the site is statically generated based on blogposts stored on the PDS.

For example, here is the PDS data versus the actual blogpost by haileyok.com. Source code is here: https://github.com/haileyok/blug. It is hosted 100% independently from both whtwnd (despite using the same namespace) and Bluesky/bsky.social.
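Reading records straight from a PDS, with no relay involved, can be sketched like this. The snippet assumes the PDS exposes the `com.atproto.repo.listRecords` XRPC query with `repo`, `collection` and `limit` parameters; the host `pds.example`, the handle `alice.example` and the collection name are placeholders, not the values used by the blog linked above. It only builds the URL, so nothing is fetched.

```python
from urllib.parse import urlencode


def list_records_url(pds_host: str, repo: str, collection: str, limit: int = 10) -> str:
    """Build the URL a static-site generator could fetch to list records.

    pds_host:   hostname of the PDS that stores the repository
    repo:       handle or DID of the account
    collection: the record collection to list (e.g. a blog-entry collection)
    """
    query = urlencode({"repo": repo, "collection": collection, "limit": limit})
    return f"https://{pds_host}/xrpc/com.atproto.repo.listRecords?{query}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(list_records_url("pds.example", "alice.example", "com.whtwnd.blog.entry", limit=5))
```

A site generator would fetch that URL at build time, parse the JSON response and render each record as a page.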

To recap, the federation model is as follows:

  • Data stored in PDS - can be used directly from here
  • Relay aggregates data from any PDS that wishes to participate
  • Clients and/or AppViews can use data from Relay or from PDS
  • Data is not shared/duplicated/federated directly between PDSs
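The recap above can be modelled with a few lines of toy code. This is not how any real PDS or relay is implemented; the classes, method names, hosts and DIDs are invented purely to show the direction in which data flows.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

Record = Tuple[str, str, str]  # (author DID, collection, text)


@dataclass
class PDS:
    host: str
    records: List[Record] = field(default_factory=list)

    def create_record(self, did: str, collection: str, text: str) -> None:
        # Data is written to, and stays on, the author's own PDS.
        self.records.append((did, collection, text))


@dataclass
class Relay:
    sources: List[PDS] = field(default_factory=list)

    def request_crawl(self, pds: PDS) -> None:
        # A PDS opts in; the relay never pushes data into a PDS.
        self.sources.append(pds)

    def firehose(self) -> Iterator[Record]:
        # Simplified: a real firehose is a live, ordered event stream.
        for pds in self.sources:
            yield from pds.records


def app_view(relay: Relay, namespace: str) -> List[Record]:
    """An AppView keeps only the records in the namespace it understands."""
    return [r for r in relay.firehose() if r[1].startswith(namespace + ".")]


if __name__ == "__main__":
    home, other = PDS("pds.example"), PDS("pds.other.example")
    home.create_record("did:plc:alice", "app.bsky.feed.post", "hello")
    other.create_record("did:plc:bob", "app.bsky.feed.post", "hi")
    other.create_record("did:plc:bob", "com.whtwnd.blog.entry", "a long post")
    relay = Relay()
    relay.request_crawl(home)
    relay.request_crawl(other)
    print(app_view(relay, "app.bsky"))  # both microblog posts, no blog entry
    print(home.records)                 # still only alice's own record
```

Note that `home.records` never grows when `other` posts something: the relay and AppView see everything, the PDSs only see their own users.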

This means:

  • Data is not tied to particular software (e.g how posts would be tied to a fedi instance and the software it runs)
  • Standard data format that can be used by multiple AppViews using Namespaces
  • Namespaces do not clash with each other (long-form content isn't federated to microblogging and vice-versa)

This also means that ATproto's federation model is NOT:

  • Instance-to-Instance
  • Centralized (anyone can run a PDS like anyone can run a webserver)
    • Currently, roughly 600 people are self-hosting PDSs.
  • Algorithmically controlled (???)
    • (A PDS just has data, a relay just spits out that data chronologically. The PDS has no "algorithm")

The Moderation model

With ATproto, a "top-down" approach to moderation was chosen. Traditionally this means there is a central authority, but in this case the term is more about trusting any given entity to moderate and/or label content. In the context of Bluesky, this allows you to block thousands of users at once with shareable moderation lists, or apply labels to users and content using labelers and moderation services.

As mentioned previously, when setting up a PDS - at least for the Bluesky usecase - users signing up will by default be using moderation.bsky.app / did:plc:ar7c4by46qjdydhdevvrndac

Depending on the client/appview implementation, you may also be additionally subscribed to moderation-de.bsky.app or moderation-br.bsky.app for German and Brazilian users respectively - from what I can tell the client subscribes to these upon logging in from Germany/Brazil and this is not done on the PDS.

PDS Administrators can also set a different moderation service/labeler as the default if they so desire, like when they want to independently moderate, or want to run another community-run labeler, or any combination of labelers. Or god forbid, no moderation service.

This is comparable to the moderation lists of known bad-actor instances that are commonly shared on the fediverse, with some key differences. Instead of being scoped to individual instances - or in this case PDSs - most moderation lists will be scoped to individual users. Additionally, because no data from other PDSs is incoming/replicated to yours, instead of prohibiting federation from a given domain/IP, you'd be pre-applying to all new users a modlist or labeler that encodes whatever criteria you want people to be automatically blocked by, like a simple blocklist or other logic.
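A toy sketch of "pre-applying a modlist to new users" might look like the following. Everything here is hypothetical (the preference structure, the list names, the DIDs); it only illustrates that the block is per user and applied at read time, rather than a per-domain federation ban.

```python
from typing import Dict, Iterable, List, Set


def new_account_prefs(default_lists: Iterable[str]) -> Dict[str, List[str]]:
    """Preferences a PDS admin could hand to every newly created account."""
    return {"subscribed_lists": list(default_lists)}


def visible_posts(posts: List[dict], prefs: Dict[str, List[str]], lists: Dict[str, Set[str]]) -> List[dict]:
    """Drop posts whose author appears on any list the user subscribes to."""
    blocked: Set[str] = set()
    for name in prefs["subscribed_lists"]:
        blocked |= lists.get(name, set())
    return [p for p in posts if p["author"] not in blocked]


if __name__ == "__main__":
    # Hypothetical moderation lists: name -> set of blocked DIDs.
    modlists = {"known-spammers": {"did:plc:spammer1", "did:plc:spammer2"}}
    prefs = new_account_prefs(["known-spammers"])
    timeline = [
        {"author": "did:plc:alice", "text": "hello"},
        {"author": "did:plc:spammer1", "text": "buy now"},
    ]
    print(visible_posts(timeline, prefs, modlists))
```

The spammer's data still exists on their own PDS; it is simply never shown to users who carry the list.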

Additionally, Clients/AppViews can be hard-coded to always use certain Labelers and Moderation Services.

So, to recap:

  • Moderation is done with a "top-down" approach
  • Users can subscribe to Moderation lists and Content Labelers
  • PDS Admins have control over which default Moderation Lists and Labelers are subscribed to (this can be, to some extent, enforced)
  • In some cases, depending on your region, some clients might subscribe to regional moderation lists on register/first login (?)
  • Relays and AppViews are "unopinionated", whereas Labelers and other Moderation Services are "opinionated"

Feeds and Algorithms

Algorithmic Feeds, an opinionated 'filter' for content coming in from the firehose, are one of Bluesky's advertised features.

They go by the name "Feeds" in the actual UI, and they allow people to curate content based on conditions they define.

The way this is achieved is by allowing an optional, extra layer between the Relay and Client/AppView, which can return any number of posts in any order it pleases.

Apart from the format of the data they are expected to return, there is no restriction on how a feed operates. It can be as simple as filtering all posts with a certain hashtag, or as involved as running computer vision algorithms on image attachments and Alt-Text to identify cat pictures.
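The simplest case mentioned above, a hashtag filter, could be sketched as follows. It assumes a feed answers with a "skeleton" of the form `{"feed": [{"post": <post URI>}, ...]}` and leaves the AppView to hydrate the actual posts; the function name, the input post structure and the URIs are invented for the example.

```python
from typing import Dict, List


def hashtag_feed(posts: List[dict], hashtag: str, limit: int = 50) -> Dict[str, List[Dict[str, str]]]:
    """Return a feed skeleton containing only posts that use the hashtag.

    posts:   dicts with 'uri' and 'text' keys, e.g. collected from the firehose
    hashtag: tag without the leading '#'
    """
    tag = "#" + hashtag.lower()
    # Whole-token match only, so "#cats!" would not count; good enough for a sketch.
    matches = [p["uri"] for p in posts if tag in p["text"].lower().split()]
    return {"feed": [{"post": uri} for uri in matches[:limit]]}


if __name__ == "__main__":
    seen = [
        {"uri": "at://did:plc:alice/app.bsky.feed.post/1", "text": "my #cats today"},
        {"uri": "at://did:plc:bob/app.bsky.feed.post/2", "text": "no tags here"},
    ]
    print(hashtag_feed(seen, "cats"))
```

Swapping the list comprehension for an image classifier, a ranking model or a hand-curated list changes nothing about the shape of the response, which is why feeds can be arbitrarily simple or complex.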

With a feed, the structure would be:

  • AppView/Client
  • Feed
  • Relay
  • PDS

There's not much more for me to explain here that I didn't already in earlier sections. Just keep in mind that feeds are entirely optional and that they do not affect the 'following' feed, which is generated by your PDS, Relay and AppView/Client alone.

On the state of BlueSky

Bluesky, in its current state, is mostly centralised.

This is not because ATproto engineers or Bluesky developers are evil, but because their approach to designing the protocol was, effectively, to build out a traditional, scalable social media platform first, and then see how it could be split up into individual components with a consistent data structure.

This and the late arrival of federation have led most users to be on Bluesky PBC-owned Personal Data Servers.

Given the recent surge of Non-bsky-PBC PDSs appearing on the network (including my own!), I am hoping that in the future more people will be self-hosting their data or sharing personal PDSs with friends.

To reiterate, this centralisation is not good for federation, but the situation is recovering/improving now.

A direct comparison to the Fediverse for the Instance Admin

So, of course, with all that explanation out of the way I want to make a direct comparison.

With ActivityPub, because data will be incoming to your own server, and it takes the equivalent role of a Relay, AppView, Client and PDS all at once, you also take on the responsibility for any and all of those tasks. You will have to make sure nobody stores things against your rules, nobody shares them, none of it comes in and none of it goes out.

It also means you communicate directly with other people participating in the network, and store incoming data from them, but you already know that.

With ATproto, you can effectively forget your prior knowledge of how federation works in ActivityPub, as it does not apply at all.

A PDS does not store incoming data from other PDSs. This means there is no need to worry about the content that other instances host being duplicated to yours. This eliminates the responsibility for you to moderate that incoming content directly, but of course it could still affect your users.

As such, two tools come into play: Moderation Lists and Content Labelers.

Moderation lists are, effectively, just lists of users. They can be made by anyone, hosted anywhere and shared infinitely. You can subscribe to a moderation list and then either block or mute everyone on the list at once.

This suffices for the usecase of individuals, but you may want more granularity, so...

Content Labelers allow you to apply labels to both post contents and profiles/users. The logic for when any given label is applied and the available labels are entirely up to you. For example, you might apply a label for 'AI Art' to someone's profile if the labeler you've written determines it to be appropriate, or you may simply filter out all posts with certain phrases by default, or hide them behind a content warning. If you're hosting it yourself, you can also let a labeler do more 'destructive' actions on your PDS, like entirely taking down/banning a certain DID, making it entirely unavailable to anyone on your PDS regardless of the Content Labelers they're subscribed to.
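A toy labeler along those lines is sketched below. The rule table, label values, DIDs and the simplified label shape (`src`, `uri`, `val`) are assumptions made for illustration; a real labeler also signs and timestamps its labels, which is left out here.

```python
from typing import Dict, List

# Hypothetical rules: label value -> phrases that trigger it.
RULES: Dict[str, List[str]] = {
    "ai-art": ["ai art", "generated with"],
    "spoilers": ["spoiler"],
}


def label_post(labeler_did: str, post: dict) -> List[dict]:
    """Return simplified labels for a post based on phrase matching."""
    text = post["text"].lower()
    return [
        {"src": labeler_did, "uri": post["uri"], "val": val}
        for val, phrases in RULES.items()
        if any(phrase in text for phrase in phrases)
    ]


def client_action(labels: List[dict], prefs: Dict[str, str]) -> str:
    """Decide what a client does with a post: 'hide', 'warn' or 'show'.

    prefs maps a label value to 'hide', 'warn' or 'ignore'; unknown labels
    are ignored. The strictest matching preference wins.
    """
    actions = {prefs.get(label["val"], "ignore") for label in labels}
    if "hide" in actions:
        return "hide"
    if "warn" in actions:
        return "warn"
    return "show"


if __name__ == "__main__":
    post = {"uri": "at://did:plc:alice/app.bsky.feed.post/1", "text": "Generated with a model"}
    labels = label_post("did:plc:mylabeler", post)
    print(labels)
    print(client_action(labels, {"ai-art": "warn"}))
```

The labeler only publishes opinions; whether a post ends up hidden, warned about or shown is decided by each subscriber's preferences.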

On that note, Account/DID takedowns can also be done manually by administrators using the corresponding admin endpoints.
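As a hedged sketch of what such a manual takedown request could look like: the snippet below assumes the relevant endpoint is `com.atproto.admin.updateSubjectStatus` and that it takes a `subject` reference plus a `takedown` status. Both the endpoint name and the payload shape are my assumptions and should be checked against the current lexicon before use; the host and DID are placeholders, and authentication is deliberately left out.

```python
import json
from typing import Tuple


def takedown_request(pds_host: str, did: str, applied: bool = True) -> Tuple[str, str]:
    """Build the URL and JSON body for taking down (or restoring) an account.

    Assumed endpoint and payload shape; verify against the lexicon.
    Returns (url, body) and sends nothing.
    """
    url = f"https://{pds_host}/xrpc/com.atproto.admin.updateSubjectStatus"
    body = {
        "subject": {"$type": "com.atproto.admin.defs#repoRef", "did": did},
        "takedown": {"applied": applied},
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    url, body = takedown_request("pds.example", "did:plc:example")
    print(url)
    print(body)
```

Passing `applied=False` would build the request that reverses the takedown, under the same assumptions.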


An FAQ

Relays provide a point of centralization, what about that!?

That is, indeed, an issue that I have criticised multiple times before, only to be met by deaf (non-Bluesky-developer) ears suggesting that a nonprofit, ICANN or some other entity host a relay.

There is a serious problem with not having any incentive to host a relay, and I've never heard a developer or ATproto engineer talk about this problem properly. ← no longer true :P

There either needs to be an incentive to run a relay, which can be quite expensive for individuals (~$150/month as of July 2024, probably more by now), or relays need to be distributed/decentralized in one way or another.

Edit from Tuesday, October 22, 2024 12:33:04 GMT+2
For a comparison, mstdn.social is an instance that is currently home to 241,777 users, and reportedly costs roughly 1000€ per month to operate according to Stux, the instance admin. They're (at the time of writing) the 5th largest instance in the Fediverse. This is relevant context to put the price of a relay into perspective.

Though it is to be noted that other parts of the 'full stack', namely Labelers or (different) AppViews, provide more value overall yet still rely on relays. As such, one possible incentive for running one would be charging for an SLA on an existing Relay, which would then subsidize relay usage for the public. Kind of like how Proton(mail) subsidizes its free customers with the subscriptions from the paid customers.

They are VC-Funded, they will inevitably screw over everyone!

I understand where this notion might come from, but ATproto has already matured enough that it could be used without Bluesky PBC existing.

The company could die tomorrow, and with little effort the ~650 non-bsky PDSs and a relay (which would have to handle much less data at that point) could be run by volunteers and interested people.

More tight-knit federation could also be achieved through various means, but replicating the Bluesky setup shouldn't be hard.

Additionally, I am currently working on adding an ActivityPub In/Outbox to existing PDSs with minimal effort, to somewhat replicate ActivityPods' "store data here, add in/outbox for federation" approach. Just instead of Solid Pods, the data would be stored in ATproto's format. This will take me time and effort, probably a lot more than it would take most, so if you appreciate that you can support me on Ko-Fi or shoot me a message if you wanna help :)

You cannot migrate!

Wrong. You have been able to for quite some time now, although the tooling is primarily aimed at those hosting a PDS and/or developers. Unlike with ActivityPub, this isn't simply a redirect to a new actor; it actually migrates your entire DID and your repository over to a new PDS.

I would assume that a migration UI in clients will come eventually, though I would not necessarily get my hopes up.

But ActivityPub is better!

Go and discuss that on your own, I'm here to debunk common myths and/or FUD spread over the past few days, not disprove ActivityPub, which I use and enjoy.

BridgyFed is evil!

BridgyFed is a personal passion project made by someone who enjoys decentralised social media. Their website does ActivityPub and they're active with both protocols thanks to a bridge that they've made, for free, in their own time, despite having had death threats and other horrible things directed at them, primarily from people on the Fediverse.

I respect BridgyFed and see it as a good tool to bridge the gap (no pun intended), however I do believe a better approach overall is to allow PDSs to act as their own instances over ActivityPub, e.g. giving them an In/Outbox to join the Fediverse directly.

But doesn't Jack Dorsey fund BlueSky!?

He used to, when the project was still freshly handed over from Twitter, but Bluesky did too much moderation for his liking, so he went to Nostr (which doesn't really even have any identity to represent itself with) instead.

If you don't know Nostr, imagine you put Cryptobros and the worst hate from Twitter into one pot.

What about the did:plc method?

did:plc was never intended to be a permanent solution; in fact, the "plc" initially stood for "placeholder". However, it has already been used in production for long enough that it has established itself in the ATproto ecosystem, and its meaning has since been changed to "Public Ledger of Credentials", although this change is not on the site yet.

It relies on a centralised directory, plc.directory, but you are not required to use this did method. Some are using did:web for example.
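The practical difference between the two methods shows up in where the DID document is looked up. The sketch below assumes a did:plc is resolved by appending the DID to the directory's base URL, and that a hostname-only did:web is resolved at `/.well-known/did.json` on that host; the function name is made up, and did:web identifiers with ports or paths are rejected here to keep the example small.

```python
def did_document_url(did: str, plc_directory: str = "https://plc.directory") -> str:
    """Return the URL where the DID document for `did` would be fetched.

    did:plc -> looked up in the central directory
    did:web -> looked up on the domain the DID itself names
    """
    if did.startswith("did:plc:"):
        return f"{plc_directory}/{did}"
    if did.startswith("did:web:"):
        host = did[len("did:web:"):]
        # Simplification: only plain hostnames, no ports or path segments.
        if not host or ":" in host or "/" in host or "%" in host:
            raise ValueError(f"unsupported did:web identifier: {did}")
        return f"https://{host}/.well-known/did.json"
    raise ValueError(f"unsupported DID method: {did}")


if __name__ == "__main__":
    print(did_document_url("did:plc:ar7c4by46qjdydhdevvrndac"))
    print(did_document_url("did:web:example.com"))  # example.com is a placeholder
```

With did:web, the only party that has to stay online is whoever controls the domain, at the cost of tying the identity to that domain.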

If you've read this far, please note the fact that I HATE how I cannot change the sharing options, I'd make them disappear if I could