How to make atproto actually easy
Jacquard is a Rust library, or rather a suite of libraries, intended to make it much simpler to get started with atproto development, without sacrificing flexibility or performance. How it does that is relatively clever, and I think benefits from some explaining, because it doesn't really come across in descriptions like "a better Rust atproto library, with much less boilerplate". Descriptions like those especially don't really communicate that Jacquard is not simpler because someone wrote all the code for you, or had Claude do it. Jacquard is simpler because it is designed in a way which makes things simple that almost every other atproto library seems to make difficult.
The Jacquard machine was one of the earliest devices you might call "programmable" in the sense we normally mean, allowing a series of punched cards to automatically control a mechanical weaving loom.
First, let's talk boilerplate. An extremely common thing for people writing code for atproto to have to do is to write friendly helper methods over API bindings generated from lexicons. In the official Bluesky TypeScript library you get a couple of layers of *Agent wrapper classes which provide convenient helpers for common methods, mostly hand-written, because the autogenerated API bindings are verbose to call and don't necessarily handle all the eventualities. There is a lot of code dedicated to handling updates to Bluesky preferences. Among the worst for required boilerplate is ATrium, the most widely-used set of Rust libraries for atproto, which mirrors the TypeScript SDK in many ways, not all good. This results in pretty much anyone using ATrium needing to implement their own more ergonomic helpers, and often reimplementing chunks of the library for things like session management (particularly if they want to use their own lexicons), because certain important internal types aren't exported. This is boilerplate, and while LLMs are often pretty good at doing that for you these days, it still clutters your codebase.
The problem with needing handwritten helpers to do things conveniently is that when you venture off the beaten path you end up needing to reinvent the wheel a lot. This is a big barrier for people looking to "just do things" on atproto. You need to figure out OAuth, you need to write all those convenience functions, and so on, especially if you're working with your own lexicons rather than just using Bluesky's.
There are other libraries which handle some of these things better, but nothing (especially not in Rust) which got all the way there in a way that fit how I like to work, and how I think a lot of other Rust developers would like to work. Jacquard is the answer to the question a lot of my Rust atproto developer friends were asking.
Here's the canonical example. Compare to the ATrium Bluesky SDK example, which doesn't handle OAuth. There are some convenient helpers used here to elide OAuth setup stuff (helpers which ATrium's OAuth implementation lacks) but even without those, it's not that verbose, and the actual main action, fetching the timeline, is simply calling a single function with a generated API struct, then handling the result. Nothing here is Bluesky-specific that wasn't generated in seconds by Jacquard's lexicon API code generation.
#[tokio::main]
async fn main() -> miette::Result<()> {
    let args = Args::parse();
    // Build an OAuth client with file-backed auth store and default localhost config
    let oauth = OAuthClient::with_default_config(FileAuthStore::new(&args.store));
    // Authenticate with a PDS, using a loopback server to handle the callback flow
    let session = oauth
        .login_with_local_server(
            args.input.clone(),
            Default::default(),
            LoopbackConfig::default(),
        )
        .await?;
    // Wrap in Agent and fetch the timeline
    let agent: Agent<_> = Agent::from(session);
    let timeline = agent
        .send(GetTimeline::new().limit(5).build())
        .await?
        .into_output()?;
    for (i, post) in timeline.feed.iter().enumerate() {
        println!("\n{}. by @{}", i + 1, post.post.author.handle);
        println!(
            " {}",
            serde_json::to_string_pretty(&post.post.record).into_diagnostic()?
        );
    }
    Ok(())
}
Just .send() it
Jacquard has a couple of .send() methods. One is stateless: it's a method on the request builder you get from XrpcExt, an extension trait implemented on any HTTP client which implements a very simple HttpClient trait. You can use a bare reqwest::Client to make XRPC requests. You call .xrpc(base_url) and get an XrpcCall struct. XrpcCall is a builder, which allows you to pass authentication, atproto proxy settings, and labeler headers, and to set other options for the final request. There's also a similar trait, DpopExt, in the jacquard-oauth crate, which handles that form of authenticated request in a similar way. For basic stuff, this works great, and it's a useful building block for more complex logic, or when one size does not in fact fit all.
use jacquard_common::xrpc::XrpcExt;
use jacquard_common::http_client::HttpClient;
// ...
let http = reqwest::Client::new();
let base = url::Url::parse("https://public.api.bsky.app")?;
let resp = http.xrpc(base).send(&request).await?;
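To make the extension-trait idea concrete without pulling in any dependencies, here is a minimal, synchronous sketch of the same pattern. Everything in it is an assumption for illustration: MiniHttpClient, MiniCall, MiniXrpcExt, and EchoClient are made-up names and do not reflect Jacquard's real signatures. The point is only that a blanket impl gives every client an .xrpc(base) method that returns a per-call builder.

```rust
/// Hypothetical stand-in for a very small HTTP client trait (not Jacquard's HttpClient).
pub trait MiniHttpClient {
    fn send_http(
        &self,
        method: &str,
        url: &str,
        headers: &[(String, String)],
    ) -> Result<String, String>;
}

/// Stateless call builder: borrows the client, owns the base URL and per-call options.
pub struct MiniCall<'a, C: MiniHttpClient + ?Sized> {
    client: &'a C,
    base: String,
    headers: Vec<(String, String)>,
}

impl<'a, C: MiniHttpClient + ?Sized> MiniCall<'a, C> {
    /// Attach a bearer token to this one call.
    pub fn auth(mut self, token: &str) -> Self {
        self.headers
            .push(("Authorization".to_string(), format!("Bearer {token}")));
        self
    }

    /// Build the URL from the base and the NSID, then hand off to the client.
    pub fn send(self, nsid: &str) -> Result<String, String> {
        let url = format!("{}/xrpc/{}", self.base.trim_end_matches('/'), nsid);
        self.client.send_http("GET", &url, &self.headers)
    }
}

/// Extension trait: the blanket impl below adds `.xrpc()` to every MiniHttpClient.
pub trait MiniXrpcExt: MiniHttpClient {
    fn xrpc(&self, base: &str) -> MiniCall<'_, Self> {
        MiniCall {
            client: self,
            base: base.to_string(),
            headers: Vec::new(),
        }
    }
}

impl<C: MiniHttpClient + ?Sized> MiniXrpcExt for C {}

/// Fake client that reports what it would have sent instead of doing I/O.
pub struct EchoClient;

impl MiniHttpClient for EchoClient {
    fn send_http(
        &self,
        method: &str,
        url: &str,
        headers: &[(String, String)],
    ) -> Result<String, String> {
        Ok(format!("{method} {url} ({} headers)", headers.len()))
    }
}
```

Because the builder only borrows the client, nothing has to be stored between calls, which is roughly what "stateless" means here.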
The other, XrpcClient, is stateful, and can be implemented on anything with a bit of internal state to store the base URI (the URL of the PDS being contacted) and the default options. It's the one you're most likely to interact with doing normal atproto API client stuff. The Agent struct in the initial example implements that trait, as does the session struct it wraps, and the .send() method used is that trait method.
XrpcClient implementers don't have to implement token auto-refresh and so on, but realistically they should implement at least a basic version. There is an AgentSession trait which does require full session/state management.
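As a rough illustration of the stateful flavour, the sketch below shows a trait whose implementers only supply a base URI, default options, and a transport, and get send() for free, with a wrapper that simply delegates to the session it holds. This is not Jacquard's XrpcClient: MiniClient, CallOptions, MiniSession, MiniAgent, and the transport hook are all hypothetical, and the code is synchronous and std-only so it stays self-contained.

```rust
/// Default per-call options (hypothetical subset, not Jacquard's real options type).
#[derive(Clone, Default)]
pub struct CallOptions {
    pub auth_token: Option<String>,
}

/// Hypothetical stateful client trait: implementers carry the state, `send` is provided.
pub trait MiniClient {
    /// Base URI of the server being contacted, read from internal state.
    fn base_uri(&self) -> String;

    /// Default options applied to every call.
    fn opts(&self) -> CallOptions {
        CallOptions::default()
    }

    /// Transport hook; a real client would perform the HTTP request here.
    fn transport(&self, url: &str, opts: &CallOptions) -> Result<String, String>;

    /// Provided method: combines stored state with the endpoint's NSID.
    fn send(&self, nsid: &str) -> Result<String, String> {
        let url = format!("{}/xrpc/{}", self.base_uri().trim_end_matches('/'), nsid);
        self.transport(&url, &self.opts())
    }
}

/// A session that remembers which server it talks to and how to authenticate.
pub struct MiniSession {
    pub pds: String,
    pub token: String,
}

impl MiniClient for MiniSession {
    fn base_uri(&self) -> String {
        self.pds.clone()
    }

    fn opts(&self) -> CallOptions {
        CallOptions {
            auth_token: Some(self.token.clone()),
        }
    }

    fn transport(&self, url: &str, opts: &CallOptions) -> Result<String, String> {
        // Fake transport: report what would have been sent.
        Ok(format!("GET {url} authed={}", opts.auth_token.is_some()))
    }
}

/// Wrapper that implements the same trait by delegating to the wrapped session.
pub struct MiniAgent<S: MiniClient>(pub S);

impl<S: MiniClient> MiniClient for MiniAgent<S> {
    fn base_uri(&self) -> String {
        self.0.base_uri()
    }

    fn opts(&self) -> CallOptions {
        self.0.opts()
    }

    fn transport(&self, url: &str, opts: &CallOptions) -> Result<String, String> {
        self.0.transport(url, opts)
    }
}
```

Calling send() on the wrapper or on the bare session goes through the same provided method, so helper layers do not need their own request logic.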
Here is the entire text of XrpcCall::send(). build_http_request() and process_response() are public functions and can be used in other crates. The first does more or less what it says on the tin. The second does less than you might think. It mostly surfaces authentication errors at an earlier level so you don't have to fully parse the response to know if there was an error or not.
pub async fn send<R>(
    self,
    request: &R,
) -> XrpcResult<Response<<R as XrpcRequest>::Response>>
where
    R: XrpcRequest,
{
    let http_request = build_http_request(&self.base, request, &self.opts)
        .map_err(TransportError::from)?;
    let http_response = self
        .client
        .send_http(http_request)
        .await
        .map_err(|e| TransportError::Other(Box::new(e)))?;
    process_response(http_response)
}
A core goal of Jacquard is to not only provide an easy interface to atproto, but also to make it very easy to build something that fits your needs. Making "helper" functions like those part of the API surface is a big part of that, as are "stateless" implementations like XrpcExt and XrpcCall.
.send() works for any endpoint and any type that implements the required traits, regardless of what crate it's defined in. There's no KnownRecords enum which defines a complete set of known records, no restriction of Service endpoints in the agent/client, and, as much as possible, nothing that privileges any set of lexicons or way of working with the library. There's one primary method and you can put pretty much anything relevant into it. Whatever atproto API you need to call, just .send() it. Okay, there are a couple of additional helpers, but we're focusing on the core one, because pretty much everything else is just wrapping the above send() in one way or another, and they use the same pattern.
Punchcard Instructions
So how does this work? How do send() and its helper functions know what to do? The answer shouldn't be surprising to anyone familiar with Rust. It's traits! Specifically, the following traits, which have generated implementations for every lexicon type ingested by Jacquard's API code generation, but which honestly aren't hard to just implement yourself (more tedious than anything). XrpcResp is always implemented on a unit/marker struct with no fields. Together, these traits provide all the request-specific instructions to the functions.
pub trait XrpcRequest: Serialize {
    const NSID: &'static str;
    /// XRPC method (query/GET or procedure/POST)
    const METHOD: XrpcMethod;
    type Response: XrpcResp;
    /// Encode the request body for procedures.
    fn encode_body(&self) -> Result<Vec<u8>, EncodeError> {
        Ok(serde_json::to_vec(self)?)
    }
    /// Decode the request body for procedures. (Used server-side)
    fn decode_body<'de>(body: &'de [u8]) -> Result<Box<Self>, DecodeError>
    where
        Self: Deserialize<'de>
    {
        let body: Self = serde_json::from_slice(body).map_err(DecodeError::Json)?;
        Ok(Box::new(body))
    }
}
pub trait XrpcResp {
    const NSID: &'static str;
    /// Output encoding (MIME type)
    const ENCODING: &'static str;
    type Output<'de>: Deserialize<'de> + IntoStatic;
    type Err<'de>: Error + Deserialize<'de> + IntoStatic;
}
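To give a feel for how little an implementation involves, here is a simplified, std-only sketch of the same shape for a made-up endpoint from a custom lexicon. It is an assumption-laden illustration, not generated Jacquard code: the Request and Resp traits are cut-down stand-ins (no serde, no error types), and GetRecipes, GetRecipesResponse, the com.example.recipe.getRecipes NSID, and describe() are all hypothetical.

```rust
/// Whether the endpoint is a query (GET) or a procedure (POST).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Method {
    Query,
    Procedure,
}

/// Cut-down stand-in for a response marker trait.
pub trait Resp {
    const NSID: &'static str;
    const ENCODING: &'static str;
}

/// Cut-down stand-in for a request trait.
pub trait Request {
    const NSID: &'static str;
    const METHOD: Method;
    type Response: Resp;
    /// Query parameters for this request (a stand-in for serde-based encoding).
    fn params(&self) -> Vec<(&'static str, String)>;
}

/// Hypothetical request type for a custom lexicon.
pub struct GetRecipes {
    pub limit: u8,
}

/// Unit/marker struct: it carries no data, only the response-side constants.
pub struct GetRecipesResponse;

impl Resp for GetRecipesResponse {
    const NSID: &'static str = "com.example.recipe.getRecipes";
    const ENCODING: &'static str = "application/json";
}

impl Request for GetRecipes {
    const NSID: &'static str = "com.example.recipe.getRecipes";
    const METHOD: Method = Method::Query;
    type Response = GetRecipesResponse;

    fn params(&self) -> Vec<(&'static str, String)> {
        vec![("limit", self.limit.to_string())]
    }
}

/// One generic function can work out how to call any endpoint from the trait alone.
pub fn describe<R: Request>(base: &str, req: &R) -> String {
    let verb = match R::METHOD {
        Method::Query => "GET",
        Method::Procedure => "POST",
    };
    let query: Vec<String> = req
        .params()
        .iter()
        .map(|(k, v)| format!("{k}={v}"))
        .collect();
    let mut line = format!("{verb} {base}/xrpc/{}", R::NSID);
    if !query.is_empty() {
        line.push('?');
        line.push_str(&query.join("&"));
    }
    format!("{line} (expects {})", <R::Response as Resp>::ENCODING)
}
```

Nothing in describe() knows about recipes; all the endpoint-specific knowledge lives in the two impl blocks, which is the property that lets a single generic send function serve every lexicon.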
Here are the implementations for GetTimeline. You'll also note that send() doesn't return the fully decoded response on success. It returns a Response struct which has a generic parameter that must implement the XrpcResp trait above. Here's its definition. It's essentially just a cheaply cloneable byte buffer and a type marker.
pub struct Response<R: XrpcResp> {
    buffer: Bytes,
    status: StatusCode,
    _marker: PhantomData<R>,
}
impl<R: XrpcResp> Response<R> {
    pub fn parse<'s>(
        &'s self
    ) -> Result<<R as XrpcResp>::Output<'s>, XrpcError<<R as XrpcResp>::Err<'s>>> {
        // Borrowed parsing into Output or Err
    }
    pub fn into_output(
        self
    ) -> Result<<R as XrpcResp>::Output<'static>, XrpcError<<R as XrpcResp>::Err<'static>>>
    where ...
    { /* Owned parsing into Output or Err */ }
}
You decode the response (or the endpoint-specific error) out of this, borrowing from the buffer or taking ownership so you can drop the buffer. There are two reasons for this. One is separation of concerns. By two-staging the parsing, it's easier to distinguish network and authentication problems from application-level errors. The second is lifetimes and borrowed deserialization. This is a bit of a long, technical aside, so if you want to jump over it, skip down to "So what?"
Working with Lifetimes and Zero-Copy Deserialization
Jacquard is designed around zero-copy/borrowed deserialization: types like Post<'a> can borrow strings and other data directly from the response buffer instead of allocating owned copies. This is great for performance, but it creates some interesting challenges, especially in async contexts. So how do you specify the lifetime of the borrow?
The naive approach would be to put a lifetime parameter on the trait itself:
trait NaiveXrpcRequest<'de> {
    type Output: Deserialize<'de>;
    // ...
}
This looks reasonable until you try to use it in a generic context. If you have a function that works with any lifetime, you need a higher-ranked trait bound (HRTB):
fn parse<R>(response: &[u8]) ... // return type
where
    R: for<'any> XrpcRequest<'any>
{ /* deserialize from response... */ }
The for<'any> bound says "this type must implement XrpcRequest for every possible lifetime", which, for Deserialize, is effectively the same as requiring DeserializeOwned. You've probably just thrown away your zero-copy optimization, and furthermore that trait bound just straight-up won't work on most of the types in Jacquard. The vast majority of them have either a custom Deserialize implementation which will borrow if it can, a #[serde(borrow)] attribute on one or more fields, or an equivalent lifetime bound attribute, associated with the Deserialize derive macro. You will get "Deserialize implementation not general enough" if you try. And no, you cannot have an additional deserialize implementation for the 'static lifetime due to how serde works.
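The same restriction can be reproduced without serde. In the sketch below, Decode<'de> is a hypothetical stand-in for Deserialize<'de>, and OwnedName, BorrowedName, and decode_any are made-up names. An owned type implements the trait for every lifetime and so satisfies the higher-ranked bound; a borrowing type implements it only for its own lifetime and does not.

```rust
/// Hypothetical stand-in for serde's `Deserialize<'de>`.
pub trait Decode<'de>: Sized {
    fn decode(input: &'de str) -> Self;
}

/// Owned type: copies out of the input, so it implements Decode for *every* lifetime.
pub struct OwnedName(pub String);

impl<'de> Decode<'de> for OwnedName {
    fn decode(input: &'de str) -> Self {
        OwnedName(input.trim().to_string())
    }
}

/// Borrowing type: keeps a slice of the input, so it only implements
/// `Decode<'a>` for the one lifetime it borrows for.
pub struct BorrowedName<'a>(pub &'a str);

impl<'a> Decode<'a> for BorrowedName<'a> {
    fn decode(input: &'a str) -> Self {
        BorrowedName(input.trim())
    }
}

/// Generic function using a higher-ranked trait bound.
pub fn decode_any<T>(input: &str) -> T
where
    T: for<'any> Decode<'any>,
{
    T::decode(input)
}

/// Works: OwnedName satisfies `for<'any> Decode<'any>`.
pub fn owned_via_hrtb(input: &str) -> OwnedName {
    decode_any(input)
}

// Does not compile if uncommented: BorrowedName<'s> implements Decode<'s> only,
// not Decode<'any> for every 'any, so the higher-ranked bound is not met.
//
// pub fn borrowed_via_hrtb<'s>(input: &'s str) -> BorrowedName<'s> {
//     decode_any(input)
// }
```

Calling BorrowedName::decode directly is fine; it is only the "for every lifetime" requirement in the generic function that shuts the borrowing type out.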
If you instead try something like the below function signature and specify a specific lifetime, it will compile in isolation, but when you go to use it, the Rust compiler will not generally be able to figure out the lifetimes at the call site, and will complain about things being dropped while still borrowed, even if you convert the response to an owned/'static lifetime version of the type.
fn parse<'s, R: XrpcRequest<'s>>(response: &'s [u8]) ... // return type with the same lifetime
{ /* deserialize from response... */ }
It gets worse with async. If you want to return borrowed data from an async method, where does the lifetime come from? The response buffer needs to outlive the borrow, but the buffer is consumed or potentially has to have an unbounded lifetime. You end up with confusing and frustrating errors because the compiler can't prove the buffer will stay alive or that you have taken ownership of the parts of it you care about. And even if you don't return borrowed data, holding anything across an await point makes determining bounds for things like the Send autotrait (important if you're working with crates like Axum) impossible for the compiler. You could do some lifetime laundering with unsafe, but that road leads to potential soundness issues, and besides, you don't actually need to tell rustc to "trust me, bro". You can, with some cleverness, explain this to the compiler in a way that it can reason about perfectly well.
Explaining where the buffer goes to rustc
The fix is to use Generic Associated Types (GATs) on the trait's associated types, while keeping the trait itself lifetime-free:
pub trait XrpcResp {
    const NSID: &'static str;
    /// Output encoding (MIME type)
    const ENCODING: &'static str;
    type Output<'de>: Deserialize<'de> + IntoStatic;
    type Err<'de>: Error + Deserialize<'de> + IntoStatic;
}
Now you can write trait bounds without HRTBs, and with lifetime bounds that are actually possible for Jacquard's borrowed deserializing types to meet:
fn parse<'s, R: XrpcResp>(response: &'s [u8]) /* return type with same lifetime */ {
    // Compiler can pick a concrete lifetime for R::Output<'_> or have it specified easily
}
Methods that need lifetimes use method-level generic parameters:
// This is part of a trait from jacquard itself, used to genericize updates to things like the Bluesky
// preferences union, so that if you implement a similar lexicon type in your app, you don't have
// to special-case it. Instead you can do a relatively simple trait implementation and then call
// .update_vec() with a modifier function or .update_vec_item() with a single item you want to set.
pub trait VecUpdate {
    type GetRequest: XrpcRequest;
    type PutRequest: XrpcRequest;
    // ... more stuff
    // Method-level lifetime, not trait-level
    fn extract_vec<'s>(
        output: <<Self::GetRequest as XrpcRequest>::Response as XrpcResp>::Output<'s>
    ) -> Vec<Self::Item>;
    // ... more stuff
}
The compiler can monomorphize for concrete lifetimes instead of trying to prove bounds hold for all lifetimes at once, or struggle to figure out when you're done with a buffer. XrpcResp being separate and lifetime-free lets async methods like .send() return a Response that owns the response buffer, and then the caller decides the lifetime strategy:
// Zero-copy: borrow from the owned buffer
let output: R::Output<'_> = response.parse()?;
// Owned: convert to 'static via IntoStatic
let output: R::Output<'static> = response.into_output()?;
The async method doesn't need to know or care about lifetimes for the most part: it just returns the Response. The caller gets full control over whether to use borrowed or owned data. It can even decide after the fact that it doesn't want to parse out the API response type that it asked for. Instead it can call .parse_data() or .parse_raw() on the response to get loosely typed, validated data or minimally typed, maximally accepting data values out.
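The whole arrangement can be sketched end to end with only the standard library. The version below is a simplified model under stated assumptions, not Jacquard's implementation: MiniResp, MiniResponse, Profile, and GetProfileResp are hypothetical, decoding is a trivial string prefix check rather than serde, and the owned conversion is a trait method (to_owned_output) rather than a separate IntoStatic trait. What it does show is a lifetime-free marker trait with a generic associated type, a response that owns its buffer, and the caller choosing between borrowing and owning.

```rust
use std::borrow::Cow;
use std::marker::PhantomData;

/// Lifetime-free marker trait; the lifetime lives on the associated type (a GAT).
pub trait MiniResp {
    type Output<'de>;
    /// Borrowed decode: the output may point into `buf`.
    fn decode<'de>(buf: &'de str) -> Option<Self::Output<'de>>;
    /// Convert a borrowed output into one that owns all of its data.
    fn to_owned_output(out: Self::Output<'_>) -> Self::Output<'static>;
}

/// Owns the response buffer; `R` is only a type-level tag.
pub struct MiniResponse<R: MiniResp> {
    buffer: String,
    _marker: PhantomData<R>,
}

impl<R: MiniResp> MiniResponse<R> {
    pub fn new(buffer: String) -> Self {
        MiniResponse {
            buffer,
            _marker: PhantomData,
        }
    }

    /// Zero-copy: the result borrows from `self` and cannot outlive it.
    pub fn parse<'s>(&'s self) -> Option<R::Output<'s>> {
        R::decode(&self.buffer)
    }

    /// Owned: decode, convert to 'static, then let the buffer drop.
    pub fn into_output(self) -> Option<R::Output<'static>> {
        let borrowed = R::decode(&self.buffer)?;
        Some(R::to_owned_output(borrowed))
    }
}

/// Hypothetical output type that can either borrow or own its string.
pub struct Profile<'a> {
    pub handle: Cow<'a, str>,
}

/// Marker struct for a made-up endpoint.
pub struct GetProfileResp;

impl MiniResp for GetProfileResp {
    type Output<'de> = Profile<'de>;

    fn decode<'de>(buf: &'de str) -> Option<Profile<'de>> {
        let handle = buf.strip_prefix("handle=")?;
        Some(Profile {
            handle: Cow::Borrowed(handle),
        })
    }

    fn to_owned_output(out: Profile<'_>) -> Profile<'static> {
        Profile {
            handle: Cow::Owned(out.handle.into_owned()),
        }
    }
}
```

A function that produces a MiniResponse never mentions a lifetime at all, which is the property that keeps an async send() simple: the borrow only starts when the caller invokes parse() on a value it already owns.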
So what?
Well, most importantly, what this means is that people using Jacquard have to write a lot less code, and I, developing Jacquard, also have to write a lot less code to support a wide variety of use cases. Jacquard's code generation handles all the trait implementation housekeeping and marker structs for jacquard-api and for the most part you can just use the generated stuff as is. It also means that even if you don't care about zero-copy deserialization or strong typing and just want things to be easy, things are in fact easy. Just put 'static for your lifetime bounds on potentially borrowed Jacquard types, derive IntoStatic and call .into_static() to take ownership if needed, and forget about it. Use atproto string types like they're strings. Use loosely typed data values that actually know about atproto primitives like at:// URIs, DIDs, handles, CIDs, or blobs rather than just serde_json::Value or ipld_core::ipld::Ipld. And if you're working with posts from, for example, Bridgy Fed, which injects extra fields which aren't in the official Bluesky lexicon that carry the original ActivityPub data into federated Mastodon posts, you can access those fields easily via the extra_data field that the #[lexicon] attribute macro adds to record types.
So yeah. If you're writing atproto stuff in Rust, and you don't need stuff that's not implemented yet (like moderation filtering and easy service auth), consider using Jacquard. It's pretty cool. I just released version 0.5.0, which has a number of nice additions and improves the documentation a fair bit. There are a number of examples in the Tangled repository.
And if you got this far and like the library, I do accept sponsorships on GitHub.