Exploring Traits with Erased ‘serde’
I came across a programming problem recently where I wanted to use dynamic polymorphism with serde. This turned out to be much easier than I expected, and I thought it was an interesting enough case study to share, especially for people who are learning Rust.
A Brief Discussion of Polymorphism in Rust
As most of you will know, Rust’s system for polymorphism – traits – supports both static and dynamic polymorphism, with a bias towards static polymorphism.
For static polymorphism, it uses the impl keyword, or alternatively, a syntax called “trait bounds” reminiscent of C++. It is implemented through “monomorphization,” which involves making on-demand copies of any polymorphic functions at compile-time. And it is the default way to use polymorphism in idiomatic Rust, as evidenced by the fact that it comes earlier in the Rust book.
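As a minimal sketch of the two spellings (the function names here are made up for illustration), both of these are monomorphized: the compiler emits one copy per concrete type they are called with.

```rust
use std::fmt::Display;

// `impl Trait` in argument position: statically dispatched, one copy of
// the function is generated for each concrete type passed in.
fn label_impl(item: impl Display) -> String {
    format!("item: {}", item)
}

// The same thing written with an explicit trait bound.
fn label_bound<T: Display>(item: T) -> String {
    format!("item: {}", item)
}
```

Calling label_impl(3) and label_bound("apple") produces two separate instantiations, one for i32 and one for &str.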
Dynamic polymorphism, in contrast, uses the dyn keyword to create “trait objects.” This is implemented through vtables, which are also how C++ implements OOP-style polymorphism. Even though it is more of an OOP-style feature, and therefore more familiar to programmers with an OOP background, in Rust it is less commonly used. This is evidenced by the fact that it is introduced later in the Rust book with a much narrower use case in a chapter that encourages a programmer to “implement[] a solution using some of Rust’s strengths instead.”
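For comparison, here is a small sketch of a trait object in use. The Shape trait and its implementors are hypothetical names, not anything from my actual code; the point is that the slice holds values of different concrete types and each call goes through a vtable.

```rust
// A hypothetical trait with two unrelated implementors.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// One compiled function handles every `Shape`; `s.area()` is an
// indirect call resolved at run-time.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```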
The biggest reason dynamic polymorphism is not one of “Rust’s strengths” is that only object-safe traits can be used with dynamic polymorphism, due to the technical limitations of vtables. Whether a trait is “object-safe” is defined by whether it meets a long list of criteria, which generally get more liberal over time as people agree on how to address technical limitations, but fundamentally only some traits can be used with vtables. Additionally, dynamic polymorphism adds a performance cost, due to indirect calls and fewer optimization opportunities.
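One of those criteria is that methods can't have their own type parameters. A rough sketch with a made-up Describe trait: the generic version (shown only in the comment) can't be turned into a trait object, while the version that takes a trait object argument can.

```rust
use std::fmt::Display;

// A generic method like this would make the trait unusable with `dyn`
// (the compiler reports E0038), because there is no single function to
// put in the vtable:
//
//     trait Describe {
//         fn describe_with<T: Display>(&self, extra: T) -> String;
//     }

// Object-safe variant: the type parameter is replaced by a trait object.
trait Describe {
    fn describe_with(&self, extra: &dyn Display) -> String;
}

struct Apple;

impl Describe for Apple {
    fn describe_with(&self, extra: &dyn Display) -> String {
        format!("apple ({})", extra)
    }
}

// Works because `Describe` has exactly one vtable entry per method.
fn describe_boxed(d: Box<dyn Describe>) -> String {
    d.describe_with(&42)
}
```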
The biggest reason to use dynamic polymorphism in spite of these issues is when an “object” needs to take on a range of possible values at run-time that can’t be expressed in an enum, because other code has to be able to expand the list. As the Rust book points out, this comes up especially often in GUI programming, where the GUI framework has no way to enumerate every possible widget and know how to draw it or how it should handle events.
My Situation
I’m not currently a GUI programmer and I rarely use dynamic polymorphism. My recent experience before Rust was with Haskell and C++ template programming, and both of those are more similar in style to Rust’s static polymorphism.
But it still occasionally comes up.
Step 0: A Normal serde Use Case
So here was the situation: I had a data structure that I was serializing into JSON so I could send the JSON over TCP. For the sake of the blog post, let’s pretend I was sending reports on groceries as an extremely contrived example:
    pub enum MeatStatus {
        Veg,
        Fish,
        Meat,
    }

    pub struct CustomerId(pub u64);

    pub struct GroceryItem {
        pub description: String,
        pub customer_id: CustomerId,
        pub price_in_cents: u64,
        pub calories: f64,
        pub grams_protein: f64,
        pub grams_carbs: f64,
        pub grams_fat: f64,
        pub grams_alcohol: f64,
        pub meat_status: MeatStatus,
        pub halal: bool,
        pub kosher: bool,
    }
Now, I not only wanted to send this data out on the wire, but I also wanted to aggregate it. How many calories was each customer buying, total? How many customers were vegetarian, pescetarian, or religiously observant?
So I needed to pass this data structure around once I got it from the cash register (thank you for bearing with this silly example), and then after extracting some data from it, send it over the wire.
Well, Rust makes this sort of thing easy: “There’s a crate for that.” In this case, it’s serde, which lets you annotate data structures for serialization into JSON and other formats. A simple call to a derive macro makes it implement the serde Serialize trait:
#[derive(Serialize)]
pub enum MeatStatus
...
#[derive(Serialize)]
pub struct CustomerId(pub u64);
#[derive(Serialize)]
pub struct GroceryItem {
...
So far, very easy and boring (though we should probably take more time to appreciate just how amazing serde is, which I will someday write more about in a dedicated blog post).
I then collect the data from the cash register with a function that looks like this, as the cash register has a completely different trait-dependent notion of the food, which is still a static trait because … each cash register is only for one general category of food, because … it’s actually a farmer’s market (I’m good at examples!):
    fn extract_grocery_data<T: FarmersMarketStand>(
        customer_id: CustomerId,
        item: &T::Item,
    ) -> Result<GroceryItem> {
        Ok(GroceryItem {
            description: item.read_description()?,
            customer_id,
            calories: item.calculate_calories()?,
            ...
        })
    }
Each farmer’s market stand has its own Item type, and the data from each is extracted and put into this generic structure, so that I can both process it and send it over the wire. Easy enough!
Step 1: A New Requirement
I thought this code was well-structured and well-architected, and patted myself on the back for it! But, as any experienced programmer knows, the true test of a software architecture is when you get a new requirement.
It’s when you get a new requirement (including “fix this bug we found”) that you actually learn if you did a good job with the architecture. It’s the only objective measure. If you built flexibility in, did it have anything to do with the new set of requirements? If not, it might have been over-engineered. Did you make any decisions that made it unnecessarily inflexible? If so, it might have been poorly engineered. Can you still even read the code so you can change it? Do you know exactly where the change fits? Are you tempted to throw the code out and rewrite it from scratch? Can you even still run it on your machine?
I digress.
The new requirement was quite simple: the farmers wanted us to pipe some data back to them from the grocery items. They were already connecting to the TCP stream, but the data we were using to aggregate wasn’t enough. We had to convey more information in the JSON, and unfortunately, this information was FarmersMarketStand-specific.
Now, we had to add an additional field to our data structure. But what type should it be? I don’t need to use it for analytics, unlike the other fields. I just need to get it to the TCP connection so the farmers can get it right back:
    pub struct GroceryItem {
        ...
        pub halal: bool,
        pub kosher: bool,
        pub market_specific_data: ???
    }
Now, if I want to use static polymorphism, I have to add a type parameter to GroceryItem:
    pub struct GroceryItem<T: Serialize> {
        ...
        pub market_specific_data: T,
    }
But if I do this, I have to keep on parameterizing all my functions after this on this new type parameter. Besides, this would mean that I can’t send all the GroceryItems through a single channel; I have to have a separate channel per FarmersMarketStand. Maybe I could figure it out, but I don’t feel like I should have to, and besides, I’m trying not to have to rearchitect half the program.
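To make the channel problem concrete, here is a toy sketch using std's mpsc channels. The trimmed-down Item type and the payload types (u32 and &str) are stand-ins I made up; the real code has richer types. Each instantiation of Item is an unrelated type, so each needs its own channel, and every helper has to carry the type parameter along.

```rust
use std::fmt::Display;
use std::sync::mpsc;

// Hypothetical, trimmed-down item: `T` is the stand-specific payload.
struct Item<T> {
    description: String,
    extra: T,
}

// Every function that touches an item now carries the parameter too.
fn describe<T: Display>(item: &Item<T>) -> String {
    format!("{}: {}", item.description, item.extra)
}

fn collect_descriptions() -> Vec<String> {
    // `Item<u32>` and `Item<&str>` are different types, so two channels.
    let (apple_tx, apple_rx) = mpsc::channel::<Item<u32>>();
    let (bacon_tx, bacon_rx) = mpsc::channel::<Item<&'static str>>();

    apple_tx
        .send(Item { description: "Apples".to_string(), extra: 30 })
        .expect("receiver is alive");
    bacon_tx
        .send(Item { description: "Bacon".to_string(), extra: "Stolzfus and Sons" })
        .expect("receiver is alive");
    // Sending an apple item on `bacon_tx` would be a type error.

    // Drop the senders so the receiving loops below terminate.
    drop(apple_tx);
    drop(bacon_tx);

    let mut out = Vec::new();
    for item in apple_rx {
        out.push(describe(&item));
    }
    for item in bacon_rx {
        out.push(describe(&item));
    }
    out
}
```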
An alternative prospect is serializing the data first, since the only thing I’m going to do with it is serialize it. Then, I can store it in a serialized form. serde_json, which implements serde support for JSON, has a type and a function just for this purpose: serde_json::Value and serde_json::to_value.
That gives us something like this:
    pub struct GroceryItem {
        ...
        pub market_specific_data: serde_json::Value,
    }

    fn extract_grocery_data<T: FarmersMarketStand>(
        customer_id: CustomerId,
        item: &T::Item,
    ) -> Result<GroceryItem> {
        let market_specific_data = item.read_market_specific_data()?;
        // to_value returns a Result, so the error has to be propagated too
        let market_specific_data = serde_json::to_value(&market_specific_data)?;
        ...
The problem here is, the farmers only connect to the TCP connection maybe 10% of the time, and the rest of the time, I don’t want to pay the extra cost of serialization. Plus, I don’t want to pay the cost of serializing to this intermediate format, and then to JSON, rather than serializing directly to JSON.
Step 2: Dynamic Polymorphism
Now, you might be having an idea right now: Why not use dynamic polymorphism? This way we can have a little blob that means “I know how to serialize myself,” but we only have to do the serialization if it actually comes up. We don’t have to know anything else about the blob, nor do we have to pass the type all over the place at compile-time with all the baggage that comes with that.
So you write something like this:
    pub struct GroceryItem {
        ...
        pub market_specific_data: Box<dyn Serialize>,
    }
… and you find out that Serialize is not object-safe. You look up the docs for the Serialize trait, and lo and behold! It’s got one method:
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
Well, why isn’t this object-safe? At a Rust level it’s one method, but it’s a method that uses static polymorphism. At a Rust level, we might think we just need to store what method to call at run-time, but actually, by the time we get to run-time, this isn’t a single method anymore. It will have been monomorphized into one method for every possible value of S, every possible serializer.
Now, we’re only using the JSON serializer, but there’s no way for the method to know that. To make a vtable for this method, Rust would have to write down an implementation of this method for every possible serializer, which is too many and not a well-defined set.
OK, well, you might think, why not take advantage of the fact that we’re just using the JSON serializer? Why not write this:
    trait JsonSerialize {
        fn json_serialize(
            &self,
            // serde_json's serializer, with its writer fixed to a trait object
            serializer: &mut serde_json::Serializer<&mut dyn std::io::Write>,
            // for this serializer, Ok is () and Error is serde_json::Error
        ) -> Result<(), serde_json::Error>;
    }
This trait is like Serialize, but because it no longer uses static polymorphism, it’s now object-safe. Only one method is needed per implementing type.
Well, how do we implement this trait? Serialize has a derive macro, but JsonSerialize does not. However, a type’s JsonSerialize implementation could just call the Serialize implementation. And rather than making every farmer at the market do this for their own type, we can use a blanket implementation that says if a value is Serialize, it’s also JsonSerialize:
    impl<T> JsonSerialize for T where T: Serialize {
        fn json_serialize(
            &self,
            serializer: &mut serde_json::Serializer<&mut dyn std::io::Write>,
        ) -> Result<(), serde_json::Error> {
            // no trailing semicolon: the Result is the return value
            self.serialize(serializer)
        }
    }
So we can have all the trait implementations for the object-safe trait be implemented using static polymorphism based on the non-object-safe trait. This is a common pattern and it’s known as type erasure, because you’ve erased all the <T: Serialize> you would otherwise need everywhere you mentioned the GroceryItem type.
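If you want to play with this pattern without pulling in serde, here is a std-only model of it. Emit stands in for Serialize (generic over its output, so not object-safe) and StringEmit stands in for JsonSerialize (pinned to one output type, so object-safe). All of these names are invented for the sketch.

```rust
use std::fmt::{self, Write};

// Stand-in for `Serialize`: generic over the sink, so not object-safe.
trait Emit {
    fn emit<W: Write>(&self, out: &mut W) -> fmt::Result;
}

// Stand-in for `JsonSerialize`: pinned to one sink, so object-safe.
trait StringEmit {
    fn emit_to_string(&self, out: &mut String) -> fmt::Result;
}

// Blanket impl: anything that is `Emit` is automatically `StringEmit`.
impl<T: Emit> StringEmit for T {
    fn emit_to_string(&self, out: &mut String) -> fmt::Result {
        self.emit(out)
    }
}

struct Apple {
    variety: &'static str,
}

impl Emit for Apple {
    fn emit<W: Write>(&self, out: &mut W) -> fmt::Result {
        write!(out, "apple:{}", self.variety)
    }
}

// Takes a trait object: the concrete type has been erased.
fn render(item: &dyn StringEmit) -> String {
    let mut s = String::new();
    item.emit_to_string(&mut s)
        .expect("writing to a String cannot fail");
    s
}
```

Apple only implements Emit, yet render accepts a reference to it, because the blanket implementation supplies StringEmit for free.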
However, this isn’t very good, because we want to use this as part of a serializable structure:
    #[derive(Serialize)]
    pub struct GroceryItem {
        ...
        pub market_specific_data: Box<dyn JsonSerialize>,
    }
See, when the Serialize derive macro gets to the market_specific_data field, it finds that the field’s type doesn’t implement Serialize. It just implements JsonSerialize, since that’s how we made it object-safe. However, the macro is trying to implement Serialize on GroceryItem – for all serializers – and it’s never heard of JsonSerialize.
Step 3: There’s a crate for that!
At this point, I thought: There’s got to be a way to entirely type-erase Serialize. The problem with the method in Serialize is that it’s passed in a statically polymorphic Serializer – but what if we type-erased Serializer? The problem with that is Serializer has like a bajillion methods, so we’d have to deal with all of them in our type-erased version.
My conclusion? It’s possible, but it’d be a lot of work, so much that it might well be its own crate. And when you have that thought, well, one possibility is that crate may already exist.
And lo and behold, it does! Allow me to introduce the excellent erased-serde by David Tolnay. It does all of the work of type erasure for all of serde, and if you’re new to type erasure, the code is worth a read. It even uses macros!
It called its type-erased trait Serialize, layered on top of the non-type-erased trait, also called Serialize. If your type implemented Serialize, it automatically implemented Serialize due to a blanket implementation, which was great, because then you could write Box<dyn Serialize>, and would you know that dyn Serialize also had an implementation for Serialize already done?
use erased_serde::Serialize as ErasedSerialize;
I mean to say: If your type implemented Serialize, it automatically implemented ErasedSerialize due to a blanket implementation, which was great, because then you could write Box<dyn ErasedSerialize>, and would you know that dyn ErasedSerialize also had an implementation for Serialize already done?
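That round trip can be modeled in miniature with std only. To be clear, this is my own toy version and not erased-serde's actual code: Emit plays the role of serde's Serialize, ErasedEmit plays the role of ErasedSerialize, and the "serializer" is just a fmt::Write sink with a single method rather than a bajillion.

```rust
use std::fmt::{self, Write};

// Stand-in for `serde::Serialize`: generic over the sink, not object-safe.
trait Emit {
    fn emit<W: Write>(&self, out: &mut W) -> fmt::Result;
}

// Stand-in for the erased trait: the sink itself is a trait object.
trait ErasedEmit {
    fn erased_emit(&self, out: &mut dyn Write) -> fmt::Result;
}

// Direction 1: every `Emit` type is automatically `ErasedEmit`.
impl<T: Emit> ErasedEmit for T {
    fn erased_emit(&self, mut out: &mut dyn Write) -> fmt::Result {
        // `&mut dyn Write` is itself a `Write`, so it can serve as `W`.
        self.emit(&mut out)
    }
}

// Direction 2: the trait object implements the generic trait again.
impl Emit for dyn ErasedEmit {
    fn emit<W: Write>(&self, out: &mut W) -> fmt::Result {
        self.erased_emit(out)
    }
}

struct Variety(&'static str);

impl Emit for Variety {
    fn emit<W: Write>(&self, out: &mut W) -> fmt::Result {
        write!(out, "variety={}", self.0)
    }
}

// A struct with an erased field that is still `Emit` for any sink.
struct Report {
    description: String,
    extra: Box<dyn ErasedEmit>,
}

impl Emit for Report {
    fn emit<W: Write>(&self, out: &mut W) -> fmt::Result {
        write!(out, "{} / ", self.description)?;
        // Auto-deref reaches the `impl Emit for dyn ErasedEmit` above.
        self.extra.emit(out)
    }
}

fn render<T: Emit>(value: &T) -> String {
    let mut s = String::new();
    value.emit(&mut s).expect("writing to a String cannot fail");
    s
}
```

Report has no type parameter, but it can hold any Emit value in its extra field and still be emitted to any sink.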
This meant, all in all, that I could write this:
    #[derive(Serialize)]
    pub struct GroceryItem {
        ...
        pub market_specific_data: Box<dyn ErasedSerialize>,
    }

    fn extract_grocery_data<T: FarmersMarketStand>(
        customer_id: CustomerId,
        item: &T::Item,
    ) -> Result<GroceryItem> {
        Ok(GroceryItem {
            description: item.read_description()?,
            customer_id,
            calories: item.calculate_calories()?,
            ...
            market_specific_data: Box::new(item.read_market_specific_data()?),
        })
    }
The cast from Box<impl Serialize> to Box<dyn ErasedSerialize> is implicit, and Box<dyn ErasedSerialize> implements Serialize, so the derive macro is happy!
Voilà!
The code is available in a GitHub repo and the output shows the power of Rust polymorphism:
[jim@palatinate:~/hobby/groceries]$ cargo run | jq .
Finished dev [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/groceries`
[
  {
    "description": "Apples",
    "customer_id": 0,
    "price_in_cents": 3,
    "calories": 10,
    "grams_protein": 10,
    "grams_carbs": 10,
    "grams_fat": 10,
    "grams_alcohol": 10,
    "meat_status": "Veg",
    "halal": true,
    "kosher": true,
    "market_specific_data": {
      "variety": "Gala",
      "doctors_kept_away": 30
    }
  },
  {
    "description": "Bacon",
    "customer_id": 1,
    "price_in_cents": 3000,
    "calories": 10,
    "grams_protein": 10,
    "grams_carbs": 10,
    "grams_fat": 10,
    "grams_alcohol": 10,
    "meat_status": "Meat",
    "halal": false,
    "kosher": false,
    "market_specific_data": {
      "farm_of_origin": "Stolzfus and Sons",
      "breakfasts_served": 15
    }
  }
]
Step 4: Bonus Round: Another Requirement
Does this work well? Let’s see how a new requirement can be dealt with!
So next I learn that I have to implement Clone on GroceryItem, for some of the processing code where we do the data metrics. I might think, well, this should be easy! I have a Box, and I never write to the inner value, so I just need a cloneable Box, an Arc. Then, I can #[derive(Clone)], and the market_specific_data field will just be multiply-owned.
But, alas, no! This error appears:
error[E0277]: the trait bound `Arc<dyn erased_serde::Serialize>: _::_serde::Serialize` is not satisfied
Why does this work for Box<dyn ErasedSerialize> and not Arc<dyn ErasedSerialize>? Well, this is actually quite straightforward: There is an implementation of Serialize for Box<T> when T implements Serialize, part of the serde crate. It does not exist for Arc.
I know that I can’t do the same in my own crate, but for Arc instead of Box:
    impl<T> Serialize for Arc<T>
    where
        T: Serialize,
    {
        #[inline]
        fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
        where
            S: Serializer,
        {
            (**self).serialize(serializer)
        }
    }
– because that would violate the dreaded “orphan rule”:
error[E0117]: only traits defined in the current crate can be implemented for types defined outside of the crate
   --> src/main.rs:191:1
    |
191 |   impl<T> Serialize for Arc<T>
    |   ^                     ------ `Arc` is not defined in the current crate
    |  _|
    | |
192 | | where
193 | |     T: Serialize,
194 | | {
...   |
201 | |     }
202 | | }
    | |_^ impl doesn't use only types from inside the current crate
    |
    = note: define and implement a trait or new type instead
But if we know the orphan rule well, or just read the note in the error message, we know that we can get around it with… you guessed it, a newtype!
Newtypes are named after the Haskell keyword newtype, though in Rust they don’t use that keyword, so we refer to the “newtype pattern.” In both Haskell and Rust, they’re the standard way to get around the orphan rule. The premise is simple: We define a new type that is distinct to the compiler (so we can’t use type) but not practically distinct. It’s generally implemented in Rust as a tuple-struct with one field.
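Here is the pattern in a std-only sketch, with invented names: Describe is a local stand-in trait and fmt::Display plays the part of the foreign trait. Implementing Display directly for Arc<dyn Describe> would run into the orphan rule, since neither Display nor Arc is defined in my crate, but the tuple-struct wrapper is a local type, so the impl is allowed. As a bonus, Arc is cloneable, so Clone can simply be derived.

```rust
use std::fmt;
use std::sync::Arc;

// A local trait, standing in for the erased serialization trait.
trait Describe {
    fn describe(&self) -> String;
}

struct Apple;

impl Describe for Apple {
    fn describe(&self) -> String {
        "Gala apple".to_string()
    }
}

// The newtype: a tuple-struct with one field. Cloning it only bumps the
// reference count; the inner value is shared, not copied.
#[derive(Clone)]
struct Shared(Arc<dyn Describe>);

// Allowed: `Shared` is defined in this crate, even though `Display` is not.
impl fmt::Display for Shared {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0.describe())
    }
}
```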
There are two ways to go with this, as this blog post indicates (which surprisingly enough is also about serde!). We can try and fix the Arc<T> problem for everybody with a generic newtype, or just for ourselves with a regular ol' newtype.
Here’s how the regular newtype solution looks:
    #[derive(Clone)]
    pub struct MarketSpecificData(Arc<dyn ErasedSerialize>);

    impl Serialize for MarketSpecificData {
        #[inline]
        fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
        where
            S: Serializer,
        {
            (*self.0).serialize(serializer)
        }
    }

    #[derive(Serialize, Clone)]
    pub struct GroceryItem {
        pub description: String,
        pub customer_id: CustomerId,
        ...
        pub kosher: bool,
        pub market_specific_data: MarketSpecificData,
    }
If your takeaway at this point is that writing trait-heavy code involves a lot of functions that call other functions with the same name and almost the same arguments, you’re not wrong.
However, in this case it was all unnecessary, as it turns out that we can get support for Arc<T> from serde itself if we enable the rc feature:
[jim@palatinate:~/hobby/groceries-contrived-example]$ cargo add --features rc,derive serde
Updating crates.io index
Adding serde v1.0.143 to dependencies.
Features:
+ derive
+ rc
+ serde_derive
+ std
- alloc
- unstable
Both versions are available in the example repo’s clone branch.