Bird Half-Baked

Embrace chaotic progression

Exploring Bluesky (part 1)

Bluesky! Jack Dorsey’s passion project, a.k.a. the hot new social media platform people are buzzing about. Okay, so it’s not exactly new, but it only recently opened to the public. There has also been a recent influx of users onto the platform (myself included, since my FOMO knows no bounds). There is a lot of fear around content ownership, especially among artists, and with the latest changes to X’s policy on using content to train generative AI models, I think Bluesky is a breath of fresh air for everyone. I think it is a very cool experiment, and I support what the company is trying to achieve: an open, decentralized way to manage social media posts that gives creators even more control.

I don’t think it is practical to cover all of my exploration of the platform in one post, so I want to break it up into more digestible pieces. This is not a strict plan, but I want to split things into the following posts, which get progressively more technical.

  • Post 1 (this one): General intro to Bluesky and a quick look at the decentralization model and what it aims to achieve
  • Post 2: Deeper dive into ATProto, and the challenges of decentralization
  • Post 3: Shall we create our own server? (we shall)

This might change, but I think the approach of general -> more technical is one that is more friendly to all.

If you are reading this and already use Bluesky, feel free to skip this one; you probably already know everything covered here.

Closer Look at Bluesky

What is Bluesky?

Bluesky, as mentioned in the intro, is a Twitter clone that was started by Jack Dorsey. His goal was, first and foremost, to create a decentralized way of posting content. So really there are two parts to Bluesky: the application that you use as a user, and the cool tech stuff behind the scenes. I’ll get to the more technical stuff further below (including the new protocol Bluesky helped invent, called ATProto). For now, it is worth looking at just the application as a gentler intro.

Looking at Bluesky as it is currently, there will be a lot of nostalgia for those who grew up on OG Twitter. The interface is simplified, and there aren’t a lot of feed interruptions like ads or community notes. I’m not going to go into exhaustive detail; I’ll just cover the things I noticed, whether they stayed the same or were different enough to be noteworthy in my eyes.

Simple panels feel nicer

Search, notifications, and so on. Compared to its predecessor, you can really start to appreciate the lack of bloat in the left pane alone.

When seen side by side, the comparison is fairly stark. I am of the opinion that if your side panel has a “More” option, you’ve already lost the UX battle, because no one can know what “More” contains. That being said, I am not a UX engineer, just a user, so my opinion is probably invalid.

The content is still front and center on the page, and feeds are on the right side. In terms of high-level layout, there is not much difference between Twitter/X and Bluesky.

Content and profile layout also remains largely the same

Compared to its predecessor and many other timeline-based content platforms, there is a lot of familiarity. That is, I don’t think Bluesky wanted to deviate from Twitter’s layout, so it just copied it over. There are of course minor tweaks to the icons, but that’s about it.

Reposting and quoting work the same too. There are again some slight differences in the appearance of icons and styling, but the features will be immediately usable by anyone who used Twitter/X. The profile also looks similar, though with some extra tabs like “Feeds” and “Starter Packs”.

Down with Trends, up with Feeds

On Twitter, the right pane generally focuses on “Trends”, which are based on some geolocation info you provide as well as some generalized popularity score. Bluesky instead focuses on feeds. This is more similar to the Reddit style of compartmentalization, where you have streams of content you subscribe to. Then there is the official Bluesky feed, which everyone is subscribed to by default and which is where all posts seem to land. This is great because it helps with cold-starting users who may not have anyone to follow yet.

Yes you can still message each other

While there is not yet a group messaging feature, the “chat” function is the way users can DM others. It works, and I played around with it as much as I could, but I don’t really have anyone to message, so unfortunately I didn’t cover much ground there beyond what it looks like to start a conversation.

Largely automated moderation and classification

It’s no secret that Bluesky uses Hive AI for moderation, but it ALSO seems to do general labeling using Hive AI’s models. This is neat because it means that, behind the scenes, when Bluesky handles a new post it applies labels that recommendation and discovery algorithms (and moderation) can use, which makes search a lot nicer for users. And truly, on platforms like this, discovery HAS to be a first-class feature.

In fact when you sign up you will see a screen asking you to provide topics you are interested in. This is to bootstrap what I assume is a content recommendations engine (speaking from experience, cold start issues are a pain to deal with in any platform that does recommendations).

For moderation, there are the standard block lists and content muting. However, the content filters struck me as being particularly elegant. There are some very useful toggles that are shown by default, and adult content must be opted into. This is a good choice in my opinion.

I like to live dangerously, so I choose to have all adult content shown to me upfront. What could go wrong on a site full of artists? Anyway, I also think it’s really cool that in the more advanced settings of the moderation system, you can see why, at least for now, less spam is flooding into users’ feeds. It all comes down to auto-moderation, and I think Bluesky has nailed it so far, though the real test is time and usage to see how this holds up.

Back to limited character count

No options to buy more content length for now. Bluesky has a hard limit of 300 characters and that’s that! Granted, there is the ability to post images, and of course you can do the same trick of replying to your own post to extend it (threading). Still… for users who are used to the paid functionality of X/Twitter, and the autothreading, this is a bit of an annoyance. For me, though, it is probably a good limitation so I don’t blabber.
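Since there is no autothreading, here is a minimal sketch of what doing it yourself could look like. It greedily packs whole words into posts of at most 300 characters. The function name is my own, and it counts Python characters, which is an assumption on my part; the platform itself may count characters differently (emoji, for example), so treat this as an illustration rather than how Bluesky does it.

```python
LIMIT = 300  # the per-post limit mentioned above


def split_into_thread(text: str, limit: int = LIMIT) -> list[str]:
    """Greedily pack whole words into posts of at most `limit` characters."""
    posts: list[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit gets cut into hard chunks.
        while len(word) > limit:
            if current:
                posts.append(current)
                current = ""
            posts.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            # The word does not fit: close this post and start a new one.
            posts.append(current)
            current = word
    if current:
        posts.append(current)
    return posts
```

Each string in the returned list would then be posted as a reply to the previous one to form the thread.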

There is a mobile app

To be honest I have not played around enough with the mobile app nor do I care to. Some functionality does ask the user to go to the web interface to make adjustments, but that said, the interface seems largely good enough and has a good feel for moving around the different panel contexts.

Overall there is a wealth of very well-designed features playing together in a very nice way. While again I am not very big on any platform, I can appreciate how this new platform must feel for a content creator who has had to deal with bloat and other pains on the other platforms.

What makes Bluesky truly unique?

Remember there were two parts to Bluesky: the user interface you use, and ALSO the cool technical protocol invented for it.

So far I looked at the features that you as a user see. However, these are not in themselves very unique and many components can be found across other platforms, though again I do believe the mix of these components is being done so much better in Bluesky so far.

No, the magic in what makes Bluesky unique is in the way it manages content behind the scenes. I don’t mean the fancy moderation and content-labeling tools either. I mean their use of the Authenticated Transfer Protocol (ATProto for short). This enables Bluesky to operate in a decentralized manner when serving and managing content from users.

Quick footnote: A protocol is just a fancy term for a standard way of communicating. Whether between humans, or machines, protocols are just standards.

What is decentralization, what does it mean for social media?

Decentralization, noun – the transfer of control of an activity or organization to several local offices or authorities rather than one single one.

Okay so that is a pretty unhelpful definition by itself. Let’s start with how things are today instead. Today, social media platforms like Facebook (Meta), Twitter (X), Twitch, etc. all use a content model called “centralization.” This means that there is one central system that controls everything.

Wait, decentralization or federation?
Now, I know that some docs mention that Bluesky federates rather than decentralizes. The difference is that in the current Bluesky setup, there is a single global relay that is aware of all feeds and content servers. This single relay seems to be “in charge” of some gating control over which content servers are enabled in the network, and while the specification allows for open crawling, as far as I can tell this is somewhat monitored and reviewed. This is federation, but the protocol Bluesky developed does not require this setup per se, so I will ignore the semantics around that.

While the reality is that many of these platforms are complex and made of many, many small components working in tandem, all the control comes from a single authority, which maintains everything. And that authority of course can choose to do what it wants with your content. Therein lies the problem many content creators have with existing platforms: a lack of actual control over their own content. With today’s landscape of AI-generated content, quickly changing political views and controls, and the worry about which audiences can consume what content, control of content as a creator is more important than ever, and this is why people are concerned with the current status quo.

Now let’s look at the decentralized case. This term got a lot of hype behind it after the crypto boom, but it’s been around since before computers even existed. Practically speaking, it just means taking that central store of content and letting it be managed by completely separate components that come together through some agreement/protocol when needed (often called consensus, but that’s not a true catch-all for what can happen here).

Typically, platforms that decentralize define a protocol that is the glue for everything. This protocol can be agreed to by clients, and as long as something adheres to it, it can integrate with the other servers or systems involved. No single entity manages all the data, and if you make the protocol open, no single authority can control all implementations. There are caveats, but I will keep this simple since complication would just hurt at this point. As a content creator, this means you can choose which server you host your content on. If the protocol is really good, it will bake in some of the controls you would want to use, and because of the decentralized nature of this approach, no single authority could outright decide things like we’ve seen in... well, it doesn’t matter. Granted, this does not prevent bad actors from deviating in ways that are convenient to them in such a system. I will probably cover edge cases and some caveats in a later post.
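To make the “protocol as glue” idea a bit more concrete, here is a toy sketch. Everything in it is made up for illustration (the `ContentServer` interface, both server classes, and `timeline`), and it has nothing to do with how ATProto itself is defined. The point is only that two servers built differently by different people can both be read by the same client, as long as they stick to the agreed interface.

```python
from typing import Protocol


class ContentServer(Protocol):
    """The hypothetical agreement: anything with these methods can join."""

    def publish(self, author: str, text: str) -> None: ...

    def posts_by(self, author: str) -> list[str]: ...


class InMemoryServer:
    """One implementation: posts grouped per author in a dictionary."""

    def __init__(self) -> None:
        self._posts: dict[str, list[str]] = {}

    def publish(self, author: str, text: str) -> None:
        self._posts.setdefault(author, []).append(text)

    def posts_by(self, author: str) -> list[str]:
        return list(self._posts.get(author, []))


class AppendOnlyLogServer:
    """A different implementation run by someone else, same protocol."""

    def __init__(self) -> None:
        self._log: list[tuple[str, str]] = []

    def publish(self, author: str, text: str) -> None:
        self._log.append((author, text))

    def posts_by(self, author: str) -> list[str]:
        return [text for who, text in self._log if who == author]


def timeline(servers: list[ContentServer], authors: list[str]) -> list[str]:
    """A client that relies only on the protocol, not on who runs the server."""
    return [
        post
        for server in servers
        for author in authors
        for post in server.posts_by(author)
    ]
```

The client never asks which kind of server it is talking to, which is the property that lets a creator pick where their content lives.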

But wait, haven’t we seen this before?

Creating a decentralized protocol for social media, and for communications in general, is something many have attempted over the years. At this point I feel it is important to acknowledge a similar protocol: ActivityPub, a W3C standard used by Mastodon. In fact, basically everything I will talk about has an analog in that protocol. So why are we talking about ATProto instead of ActivityPub? Well, to be honest, one (ATProto) got more buzz than the other recently. I don’t think either will necessarily be better than the other, and it could come down to luck if one fails (or both, as sometimes happens in tech). That is unfortunately just how things work: sometimes one idea gets more publicity and push even if it came later. That being said, there is room for both. So yes, we have seen decentralized content management protocols before.

Mastodon still has some cool people on it. Though I personally don’t use it, many people I meet who give talks at the same tech conferences use it and swear by it. Worth a mention.

If you are also interested in it, here is the link to the protocol. There is some fantastic engineering behind this as well, and it definitely warrants some reading. I am also constantly looking at it side-by-side as I learn more about ATProto because there are MANY similarities.

ATProto: how Bluesky leverages decentralization

Okay, so this part is going to be slightly technical. I really think there needs to be some mention of how this works at a high level, so bear with me.

Bluesky the app is really just a frontend testbed for the protocol I mentioned earlier (ATProto). While it is true that most of the servers currently in use are owned by Bluesky, I see this as a way to bootstrap user comfort around what will ultimately become a decentralized set of many separate servers, with content synced so that the Bluesky client can provide the features it wants to have.

That’s already very technical. But very basically, imagine that you could create content in one place, and it becomes automatically viewable from any other social media platform (i.e. on any other server that talks the same language). That’s what the goal is here. So how do we do that?

Breaking down content management

The developers themselves have confirmed on the ATProto website that they think of content management as two parts:

  • A management layer (Called the speech layer by ATProto)
    • Made up of data servers and a large Relay that helps know what servers to be aware of
  • A serving layer (Called the reach layer by ATProto)
    • Including an “AppView”, which is what they call data prepared for presentation

The protocol focuses mostly on ensuring ways exist to manage content creation and editing, and removal of content and accounts. The layer that presents content to a user can be implemented however you want and hook into the content streams available from content servers via ATProto. To show the separation of where content sits (the data servers), and where e.g. the content filters and recommendation systems are, here is a simplified diagram of the different layers that exist (note that Presentation is something I added to clarify that the website you use is separate from the other layers).

The key to all of this is that the data servers manage everything around which content can be discovered, how data is transferred, and which users actually exist. Everything else talks to those via ATProto.
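Here is a rough sketch of that separation as I understand it. The class names follow the terms used above (data server, relay, AppView), but the methods and fields are entirely my own invention for illustration, not ATProto’s actual API. Content lives only on the data servers, the relay just knows which servers to look at, and the AppView is where something like a content filter gets applied.

```python
from dataclasses import dataclass, field


@dataclass
class DataServer:
    """Holds users and their posts; the only place content actually lives."""

    name: str
    posts: dict[str, list[str]] = field(default_factory=dict)

    def create_post(self, user: str, text: str) -> None:
        self.posts.setdefault(user, []).append(text)


@dataclass
class Relay:
    """Knows which data servers exist and gathers their content."""

    servers: list[DataServer] = field(default_factory=list)

    def crawl(self) -> list[tuple[str, str, str]]:
        # One (server name, user, text) tuple per post on every known server.
        return [
            (server.name, user, text)
            for server in self.servers
            for user, texts in server.posts.items()
            for text in texts
        ]


@dataclass
class AppView:
    """Prepares data for presentation, here by applying a muted-word filter."""

    relay: Relay
    muted_words: set[str] = field(default_factory=set)

    def feed(self) -> list[str]:
        return [
            f"{user}: {text}"
            for _, user, text in self.relay.crawl()
            if not any(word in text.lower() for word in self.muted_words)
        ]
```

Swapping in a different `AppView` (say, one with a recommendation step) would not require touching the data servers at all, which is the appeal of keeping the layers apart.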

Identity is owned by users, but managed by data servers

One interesting concept in ATProto is that every user is just a verified piece of data that has a special key tied to it. This is very similar to how blockchain identities and wallets work (and a lot of decentralized identity systems to be honest, but hey, crypto is the world many people get, so I’ll continue to use that example). To be clear this is not blockchain, and all analogies end pretty much here.

Anyway, what’s nice about this is that technically Bluesky could disappear tomorrow, but as long as you maintain access to your special key, your identity can be confirmed anywhere to continue posting without extra verification that can add annoying steps to your migration to a new platform.

That’s the theory anyway, and this adds some extra power to users. That being said, if your content is managed by a service that goes away, well if you don’t have any backups anywhere, that’s that. Luckily they thought about this stuff which is why so much of this connectivity between many data servers is a concept baked into the decentralization plan. Basically, with this protocol it is expected that content you post will be able to be mirrored to other servers.

Again, this is the goal. Unfortunately, account transfer and content mirroring are not implemented yet, according to the version of the docs current as of this writing.

Bluesky is already prepared for adding custom content servers

Looking at some of the ways we can interact with the site, there already is a way to add your own custom server that hosts your content when you sign up on the webapp. (Assuming it follows the protocol of course)

They also already support domain-based verification of a handle that you bring on your own. This made it possible, for example, to bring my own domain into the mix, and now I am proudly boasting the Birdy@birdhalfbaked.com handle on the platform. And of course, if you don’t know how this stuff is done, they have partnered with Namecheap to make this easier than ever.
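For the curious, here is a minimal sketch of how a domain-based handle check could work. As I understand the ATProto handle docs, one supported method is a DNS TXT record under `_atproto.<your domain>` whose value looks like `did=<your identifier>`; treat the exact details as my reading rather than gospel. The function names, the identifiers, and the `lookup_txt` callable (standing in for a real DNS query) are all hypothetical.

```python
from typing import Callable, Optional

TXT_PREFIX = "did="


def did_from_txt_records(records: list[str]) -> Optional[str]:
    """Return the single identifier declared in the TXT records, if any."""
    dids = {
        record.strip()[len(TXT_PREFIX):]
        for record in records
        if record.strip().startswith(TXT_PREFIX)
    }
    # Zero or conflicting declarations both count as "not verified".
    return dids.pop() if len(dids) == 1 else None


def handle_points_to(
    handle: str,
    expected_did: str,
    lookup_txt: Callable[[str], list[str]],
) -> bool:
    """Check that the domain behind `handle` declares `expected_did`."""
    records = lookup_txt(f"_atproto.{handle.lower()}")
    did = did_from_txt_records(records)
    return did is not None and did == expected_did
```

Passing the DNS lookup in as a function keeps the sketch runnable without network access; a real check would plug in an actual resolver there.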

Now, remember that I mentioned in a footnote somewhere in the beginning of this section that there is a single relay. I do expect this to some extent to be gatekept as they figure things out. But maybe not and I am wrong, and they just open the gates for anyone to provide content with their own Personal Data Server and no one touches or moderates what servers are connected. This definitely warrants exploration, but more importantly this is the WHOLE POINT behind ATProto and decentralizing social media content management. What a cool time to be alive.

So what’s next?

As I mentioned, after this toe-dip into the gist of what Bluesky is and how it is expected to develop on ATProto, I will be putting some effort into learning the nitty-gritty details of the protocol and the operations it supports. I may also clarify how Bluesky actually uses it, to build a more nuanced picture of how things are organized. In my head I can see a lot of components naturally fitting together, but for now that is still an assumption.

To their credit, ALL of this is open source, so it just warrants a careful look through the repos they maintain for both the ATProto specification and the code that drives the website. This has so far been a very interesting set of developments to read about, and as a data engineer, a lot of concepts I deal with in distributed computing start buzzing in my head as I read through the ATProto documentation and how they approach certain scale issues.

So with that, see you in the next part: Deeper dive into ATProto, and the challenges of decentralization.