Dyn async traits, part 7: a design emerges?
7 January 2022
Hi all! Welcome to 2022! Towards the end of last year, Tyler Mandry and I were doing a lot of iteration around supporting “dyn async trait” (i.e., making traits that use async fn dyn safe), and we’re starting to feel pretty good about our design. This is the start of several blog posts talking about where we’re at. In this first post, I’m going to reiterate our goals and give a high-level outline of the design. The next few posts will dive more into the details and the next steps.
The goal: traits with async fn that work “just like normal”
It’s been a while since my last post about dyn trait, so let’s start by reviewing the overall goal: our mission is to allow async fn to be used in traits just like fn. For example, we would like to have an async version of the Iterator trait that looks roughly like this[1]:
trait AsyncIterator {
    type Item;
    // Returns None once the iterator is exhausted, like Iterator::next.
    async fn next(&mut self) -> Option<Self::Item>;
}
You should be able to use this AsyncIterator trait in all the ways you would use any other trait. Naturally, static dispatch and impl Trait should work:
async fn sum_static(mut v: impl AsyncIterator<Item = u32>) -> u32 {
    let mut result = 0;
    while let Some(i) = v.next().await {
        result += i;
    }
    result
}
But dynamic dispatch should work too:
async fn sum_dyn(v: &mut dyn AsyncIterator<Item = u32>) -> u32 {
//                       ^^^
    let mut result = 0;
    while let Some(i) = v.next().await {
        result += i;
    }
    result
}
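For a sense of what dynamic dispatch over an async method involves, here is a minimal sketch of one way to approximate it by hand: the trait method returns a boxed future instead of being an async fn. This is only an illustration, not the design described in this series. Countdown and block_on are hypothetical helpers made up for the example, and block_on is a toy executor that only suits futures that never actually wait.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hand-written, dyn-compatible stand-in for the trait above: instead of
// `async fn next`, the method returns a boxed, type-erased future.
trait AsyncIterator {
    type Item;
    fn next(&mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + '_>>;
}

// Hypothetical example type: yields n, n-1, ..., 1 and then None.
struct Countdown(u32);

impl AsyncIterator for Countdown {
    type Item = u32;
    fn next(&mut self) -> Pin<Box<dyn Future<Output = Option<u32>> + '_>> {
        // The callee knows the concrete future type and boxes it for the caller.
        Box::pin(async move {
            if self.0 == 0 {
                None
            } else {
                self.0 -= 1;
                Some(self.0 + 1)
            }
        })
    }
}

// The caller only sees `dyn AsyncIterator`, yet can still await `next`.
async fn sum_dyn(v: &mut dyn AsyncIterator<Item = u32>) -> u32 {
    let mut result = 0;
    while let Some(i) = v.next().await {
        result += i;
    }
    result
}

// Toy executor (an assumption for this sketch): polls in a loop with a waker
// that does nothing. Fine here because these futures are always ready.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Calling block_on(sum_dyn(&mut Countdown(3))) drives the loop to completion. Every call to next allocates a Box here, and the boxing is spelled out in the trait itself, which is the kind of detail the goal above would hide from the trait author.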
Another goal: leave dyn cleaner than we found it
While we started out with the goal of improving async fn, we’ve also had a general interest in making dyn Trait more usable overall. There are a few reasons for this. To start, async fn is itself just sugar for a function that returns impl Trait, so making async fn in traits work is equivalent to making RPITIT (“return position impl trait in traits”) work. But also, the existing dyn Trait design contains a number of limitations that can be pretty frustrating, and so we would like a design that improves as many of those as possible. Currently, our plan lifts the following limitations, so that traits which make use of these features would still be compatible with dyn:
- Return position impl Trait, so long as Trait is dyn safe.
  - e.g., fn get_widgets(&self) -> impl Iterator<Item = Widget>
  - As discussed above, this means that async fn works, since it desugars to a function that returns impl Trait.
- Argument position impl Trait, so long as Trait is dyn safe.
  - e.g., fn process_widgets(&mut self, items: impl Iterator<Item = Widget>).
- By-value self methods.
  - e.g., given fn process(self) and d: Box<dyn Trait>, able to call d.process()
  - eventually this would be extended to other “box-like” smart pointers
If you put all three of those together, it represents a pretty large expansion to what dyn safety feels like in Rust. Here is an example trait that would now be dyn safe that uses all of these things together in a natural way:
trait Widget {
    async fn augment(&mut self, component: impl Into<WidgetComponent>);
    fn components(&self) -> impl Iterator<Item = WidgetComponent>;
    async fn transmit(self, factory: impl Factory);
}
Final goal: works without an allocator, too, though you have to work a bit harder
The most straightforward way to support RPITIT is to allocate a Box to store the return value. Most of the time, this is just fine. But there are use-cases where it’s not a good choice:
- In a kernel, where you would like to use a custom allocator.
- In a tight loop, where the performance cost of an allocation is too high.
- Extreme embedded cases, where you have no allocator at all.
Therefore, we would like to ensure that it is possible to use a trait that uses async fns or RPITIT without requiring an allocator, though we think it’s ok for that to require a bit more work. Here are some alternative strategies one might want to support:
- Pre-allocating stack space: when you create the dyn Trait, you reserve some space on the stack to store any futures or impl Trait that it might return.
- Caching: reuse the same Box over and over to reduce the performance impact (a good allocator would do this for you, but not all systems ship with efficient allocators).
- Sealed trait: you derive a wrapper enum for just the types that you need.
Ultimately, though, there is no limit to the number of ways that one might manage dynamic dispatch, so the goal is not to have a “built-in” set of strategies but rather allow people to develop their own using procedural macros. We can then offer the most common strategies in utility crates or perhaps even in the stdlib, while also allowing people to develop their own if they have very particular needs.
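To make the “sealed trait” strategy a little more concrete, here is a minimal hand-written sketch of what a derived wrapper enum might look like. It assumes a compiler that accepts async fn in traits for static dispatch; the names AnyIter, Countdown, Once, sum_any and block_on are hypothetical, and a real strategy would presumably generate the enum with a procedural macro rather than by hand.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncIterator {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

// Two hypothetical implementors.
struct Countdown(u32);

impl AsyncIterator for Countdown {
    type Item = u32;
    async fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            return None;
        }
        self.0 -= 1;
        Some(self.0 + 1)
    }
}

struct Once(Option<u32>);

impl AsyncIterator for Once {
    type Item = u32;
    async fn next(&mut self) -> Option<u32> {
        self.0.take()
    }
}

// The wrapper enum lists exactly the types we need. Dispatch is a `match`,
// so no vtable and no allocation are involved.
enum AnyIter {
    Countdown(Countdown),
    Once(Once),
}

impl AsyncIterator for AnyIter {
    type Item = u32;
    async fn next(&mut self) -> Option<u32> {
        match self {
            AnyIter::Countdown(it) => it.next().await,
            AnyIter::Once(it) => it.next().await,
        }
    }
}

// Code that would otherwise take `&mut dyn AsyncIterator` takes the enum.
async fn sum_any(v: &mut AnyIter) -> u32 {
    let mut result = 0;
    while let Some(i) = v.next().await {
        result += i;
    }
    result
}

// Toy executor (an assumption for this sketch): busy-polls with a no-op waker.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

The trade-off is that the set of types is closed: adding a new implementor means adding a new variant, which is why this only fits cases where you know every type up front.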
The design from 22,222 feet
I’ve drawn a little diagram to illustrate how our design works at a high-level:
[Diagram: the caller and the callee sit on either side of the vtable. The caller knows the types of impl Trait arguments, but does not know the type of the callee or, if the function returns impl Trait, the precise return type. The callee knows its own type and the precise return type, but does not know the types of impl Trait arguments. In between: argument adaptation to and from the vtable, the normal function found in the impl, and return value adaptation to and from the vtable.]
Let’s walk through it:
- To start, we have the caller, which has access to some kind of dyn trait, such as w: &mut dyn Widget, and wishes to call a method, like w.augment().
- The caller looks up the function for augment in the vtable and calls it:
  - But wait, augment takes an impl Into<WidgetComponent>, which means that it is a generic function. Normally, we would have a separate copy of this function for every Into type! But we must have only a single copy for the vtable! What do we do?
  - The answer is that the vtable encodes a copy that expects “some kind of pointer to a dyn Into<WidgetComponent>”. This could be a Box but it could also be other kinds of pointers: I’m being hand-wavy for now, I’ll go into the details later.
  - The caller therefore has the job of creating a “pointer to a dyn Into<WidgetComponent>”. It can do this because it knows the type of the value being provided; in this case, it would do it by allocating some memory space on the stack.
- The vtable, meanwhile, includes a pointer to the right function to call. But it’s not a direct pointer to the function from the impl: it’s a lightweight shim that wraps that function. This shim has the job of converting from the vtable’s ABI into the standard ABI used for static dispatch.
- When the function returns, meanwhile, it is giving back some kind of future. The callee knows that type, but the caller doesn’t. Therefore, the callee has the job of converting it to “some kind of pointer to a dyn Future” and returning that pointer to the caller.
  - The default is to box it, but the callee can customize this to use other strategies.
- The caller gets back its “pointer to a dyn Future” and is able to await that, even though it doesn’t know exactly what sort of future it is.
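The steps above can be imitated by hand in today's Rust, which may help make them concrete. The following is a rough sketch under several assumptions, not the actual mechanism: it uses synchronous methods (returning impl Iterator rather than a future) so that no executor is needed, it always boxes the return value, and ErasedWidget, ErasedInto, FromDyn, Panel, augment_dyn and demo are all hypothetical names invented for the example.

```rust
#[derive(Clone, Debug, PartialEq)]
struct WidgetComponent(String);

impl From<&str> for WidgetComponent {
    fn from(s: &str) -> Self {
        WidgetComponent(s.to_string())
    }
}

// The trait as users write it: generic argument, opaque return type.
trait Widget {
    fn augment(&mut self, component: impl Into<WidgetComponent>);
    fn components(&self) -> impl Iterator<Item = WidgetComponent>;
}

// Stand-in for "some kind of pointer to a dyn Into<WidgetComponent>":
// the value sits in an Option so it can be moved out through `&mut`.
trait ErasedInto {
    fn take_into(&mut self) -> WidgetComponent;
}

impl<T: Into<WidgetComponent>> ErasedInto for Option<T> {
    fn take_into(&mut self) -> WidgetComponent {
        self.take().expect("argument already consumed").into()
    }
}

// Lets the shim hand the erased argument back to the generic function.
struct FromDyn<'a>(&'a mut dyn ErasedInto);

impl From<FromDyn<'_>> for WidgetComponent {
    fn from(arg: FromDyn<'_>) -> Self {
        arg.0.take_into()
    }
}

// The "vtable" view of the trait: one non-generic entry per method.
trait ErasedWidget {
    fn augment_erased(&mut self, component: &mut dyn ErasedInto);
    fn components_erased(&self) -> Box<dyn Iterator<Item = WidgetComponent> + '_>;
}

// The shims: adapt from the erased signatures to the normal impl functions.
impl<W: Widget> ErasedWidget for W {
    fn augment_erased(&mut self, component: &mut dyn ErasedInto) {
        self.augment(FromDyn(component));
    }
    fn components_erased(&self) -> Box<dyn Iterator<Item = WidgetComponent> + '_> {
        // Callee side: it knows the precise return type, so it does the boxing.
        Box::new(self.components())
    }
}

// Caller side: it knows the argument type, so it builds the pointer,
// using a slot on its own stack rather than a heap allocation.
fn augment_dyn(w: &mut dyn ErasedWidget, component: impl Into<WidgetComponent>) {
    let mut slot = Some(component);
    w.augment_erased(&mut slot);
}

struct Panel {
    parts: Vec<WidgetComponent>,
}

impl Widget for Panel {
    fn augment(&mut self, component: impl Into<WidgetComponent>) {
        self.parts.push(component.into());
    }
    fn components(&self) -> impl Iterator<Item = WidgetComponent> {
        self.parts.iter().cloned()
    }
}

fn demo() -> Vec<WidgetComponent> {
    let mut panel = Panel { parts: Vec::new() };
    augment_dyn(&mut panel, "knob");
    augment_dyn(&mut panel, "dial");
    let w: &dyn ErasedWidget = &panel;
    let all: Vec<WidgetComponent> = w.components_erased().collect();
    all
}
```

In this sketch the adaptation lives in a separate trait that users must name explicitly. The walkthrough instead describes the adaptation happening behind dyn Widget itself, with the return strategy left customizable.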
Upcoming posts
In upcoming blog posts, I’m going to expand on several things that I alluded to in my walkthrough:
- “Pointer to a dyn Trait”:
  - How exactly do we encode “some kind of pointer” and what does that mean?
  - This is really key, because we need to be able to support
- Adaptation for impl Trait arguments:
  - How do we adapt to/from the vtable for arguments of generic type?
  - Hint: it involves creating a dyn Trait for the argument
- Adaptation for impl Trait return values:
  - How do we adapt to/from the vtable for return values of type impl Trait?
  - Hint: it involves returning a dyn Trait, potentially boxed but not necessarily
- Adaptation for by-value self:
  - How do we adapt to/from the vtable for by-value self, and when are such functions callable?
- Boxing and alternatives thereto:
  - When you call an async fn or fn that returns impl Trait via dynamic dispatch, the default behavior is going to allocate a Box, but we’ve seen that doesn’t work for everyone. How convenient can we make it to select an alternative strategy like stack pre-allocation, and how can people create their own strategies?
We’ll also be updating the async fundamentals initiative page with more detailed design docs.
Appendix: Things I’d still like to see
I’m pretty excited about where we’re landing in this round of work, but it doesn’t get dyn where I ultimately want it to be. My ultimate goal is that people are able to use dynamic dispatch as conveniently as you use impl Trait, but I’m not entirely sure how to get there. That means being able to write function signatures that don’t talk about Box vs & or other details that you don’t have to deal with when you talk about impl Trait. It also means not having to worry so much about Send/Sync and lifetimes.
Here are some of the improvements I would like to see, if we can figure out how:
- Support clone:
  - Given trait Widget: Clone and w: Box<dyn Widget>, able to invoke w.clone()
  - This almost works, but the fact that trait Clone: Sized makes it difficult.
- Support “partially dyn safe” traits:
  - Right now, dyn safe is all or nothing. This has the nice implication that dyn Foo: Foo for all types. However, it is also limiting, and many people have told me they find it confusing. Moreover, dyn Foo is not Sized, and hence while it’s cool conceptually that dyn Foo implements Foo, you can’t actually use a dyn Foo in the same way that you would use most other types.
- Improve how Send interacts with returned values (e.g., RPIT, async fn in traits, etc):
  - If you write dyn Foo + Send, that
- Avoid having to talk about pointers so much
  - When you use impl Trait, you get a really ergonomic experience today:
    - fn apply_map(map_fn: impl FnMut(u32) -> u32)
    - fn items(&self) -> impl Iterator<Item = Item> + '_
  - In contrast, when you use dyn trait, you wind up having to be very explicit around lots of details, and your callers have to change as well:
    - fn apply_map(map_fn: &mut dyn FnMut(u32) -> u32)
    - fn items(&self) -> Box<dyn Iterator<Item = Item> + '_>
- Make dyn trait feel more parametric:
  - If I have a struct Foo<T: Trait> { t: Box<T> }, it has the nice property that it exposes the T. This means we know that Foo<T>: Send if T: Send (assuming Foo doesn’t have any fields that are not Send), we know that Foo<T>: 'static if T: 'static, and so forth. This is very cool.
  - In contrast, struct Foo { t: Box<dyn Trait> } bakes in a lot of details: it doesn’t permit t to contain any references, and it doesn’t let Foo be Send.
- Make it sound:
  - There are a few open soundness bugs around dyn trait, such as #57893, and I would like to close them. This interacts with other things in this list.
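On the “support clone” item in the list above: a common workaround today is to add an object-safe cloning method to the trait and implement Clone for the boxed trait object by hand. A minimal sketch, where Button, name and clone_box are hypothetical names chosen for the example:

```rust
// Instead of `trait Widget: Clone`, which would require `Self: Sized`,
// the trait exposes a method that clones into a fresh Box.
trait Widget {
    fn name(&self) -> String;
    fn clone_box(&self) -> Box<dyn Widget>;
}

#[derive(Clone)]
struct Button {
    label: String,
}

impl Widget for Button {
    fn name(&self) -> String {
        format!("button:{}", self.label)
    }
    fn clone_box(&self) -> Box<dyn Widget> {
        // The concrete type is known here, so the ordinary Clone impl applies.
        Box::new(self.clone())
    }
}

// With this impl, `w.clone()` works when `w: Box<dyn Widget>`.
impl Clone for Box<dyn Widget> {
    fn clone(&self) -> Self {
        self.clone_box()
    }
}
```

This works, but every implementor has to write the clone_box boilerplate, which is part of why built-in support would be nicer.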
[1] This has traditionally been called Stream.