Async Interview #3: Carl Lerche

23 December 2019

Hello! For the latest async interview, I spoke with Carl Lerche (carllerche). Among many other crates¹, Carl is perhaps best known as one of the key authors behind tokio and mio. These two crates are quite widely used through the async ecosystem. Carl and I spoke on December 3rd.

Video

You can watch the video on YouTube.

Background: the mio crate

One of the first things we talked about was a kind of overview of the layers of the “tokio-based async stack”.

We started with the mio crate. mio is meant to be the “lightest possible” non-blocking I/O layer for Rust. It basically exposes the “epoll” interface that is widely used on Linux. Windows uses a fundamentally different model, so in that case there is a kind of compatibility layer, and hence the performance isn’t quite as good, but it’s still pretty decent. mio “does the best it can”, as Carl put it.

The tokio crate builds on mio. It wraps the epoll interface and exposes it via the Future abstraction from std. It also offers other things that people commonly need, such as timers.

Finally, building atop tokio you find tower, which exposes a “request-response” abstraction called Service. tower is similar to things like finagle or rack. This is then used by libraries like hyper and tonic, which implement protocol servers (http for hyper, gRPC for tonic). These protocol servers internally use the tower abstractions as well, so you can tell hyper to execute any Service.

One challenge is that it is not yet clear how to adapt tower’s Service trait to std::future::Future. It would really benefit from support of async functions in traits, in particular, which is difficult for a lot of reasons. The current plan is to adopt Pin and to require boxing and dyn Future values if you wish to use the async fn sugar. (Which seems like a good starting place, -ed.)
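To make the “boxing and dyn Future” idea a bit more concrete, here is a minimal sketch using only the standard library. The trait below is a simplified, hypothetical stand-in and not tower’s actual Service definition; the Shout service and the tiny block_on helper are likewise made up for illustration.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical, simplified request/response trait (NOT tower's real definition).
trait Service<Request> {
    type Response;
    type Error;

    // Returning a boxed `dyn Future` erases the concrete future type,
    // which lets implementations use `async` blocks internally.
    fn call(
        &mut self,
        req: Request,
    ) -> Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>>>>;
}

// Illustrative service: replies with the request in upper case.
struct Shout;

impl Service<String> for Shout {
    type Response = String;
    type Error = String;

    fn call(&mut self, req: String) -> Pin<Box<dyn Future<Output = Result<String, String>>>> {
        Box::pin(async move {
            if req.is_empty() {
                Err("empty request".to_string())
            } else {
                Ok(req.to_uppercase())
            }
        })
    }
}

// Waker that does nothing; enough for futures that complete immediately.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal executor for a single future. It busy-polls, so it is only
// suitable for demos like this one, not for real I/O.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let mut svc = Shout;
    let reply = block_on(svc.call("hello".to_string()));
    println!("{:?}", reply);
}
```

The cost of this approach is one heap allocation per call plus dynamic dispatch when polling, which is the trade-off the boxing plan accepts in exchange for the async sugar.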

Returning to the overall async stack, atop protocol servers like hyper, you find web frameworks, such as warp – and (finally) within those you have middleware and the actual applications.

How independent are these various layers?

I was curious to understand how “interconnected” these various crates were. After all, while tokio is widely used, there are a number of different executors out there, targeting both different platforms (e.g., Fuchsia) and different trade-offs (e.g., async-std). I’m really interested to get a better understanding of what we can do to help the various layers described above operate independently, so that people can mix-and-match.

To that end, I asked Carl what it would take to use (say) Warp on Fuchsia. The answer was that “in principle” the point of Tower is to create just such a decoupling, but in practice it might not be so easy.

One of the big changes in the upcoming tokio 0.2 crate, in fact, has been to combine and merge a lot of tokio into one crate. Previously, the components were more decoupled, but people rarely took advantage of that. Therefore, tokio 0.2 combined a lot of components and made the experience of using them together more streamlined, although it is still possible to use components in a more “standalone” fashion.

In general, to make tokio work, you need some form of “driver thread”. Typically this is done by spawning a background thread, but you can skip that and run the driver yourself.

The original tokio design had a static global that contained this driver information, but this had a number of issues in practice: the driver sometimes started unexpectedly, it could be hard to configure, and it didn’t work great for embedded environments. Therefore, the new system has switched to an explicit launch, though there are procedural macros #[tokio::main] or #[tokio::test] that provide sugar if you prefer.

What should we do next? Stabilize stream.

Next we discussed which concrete actions made sense next. Carl felt that an obvious next step would be to stabilize the Stream trait. As you may recall, cramertj and I discussed the Stream trait in quite a lot of detail – in short, the existing design for Stream is “detached”, meaning that it must yield up ownership of each item it produces, much like an Iterator. It would be nice to figure out the story for “attached” streams that can re-use internal buffers, which are a very common use case, especially before we create syntactic sugar.
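As a rough illustration of what “detached” means, here is a sketch with a locally defined trait. The trait is a stand-in that follows the poll_next shape under discussion, not the definition from any particular crate, and Lines and collect_all are hypothetical helpers.

```rust
use std::collections::VecDeque;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Local stand-in for a stream trait (an assumption, not a crate's real trait).
trait Stream {
    type Item;

    // `Item` has no lifetime tied to `self`, so every item is handed over
    // by value: the stream is "detached" from what it yields.
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

// Illustrative stream that yields owned lines one at a time.
struct Lines {
    remaining: VecDeque<String>,
}

impl Stream for Lines {
    type Item = String;

    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<String>> {
        // `Lines` is `Unpin`, so mutable access through the pin is allowed.
        Poll::Ready(self.remaining.pop_front())
    }
}

// Waker that does nothing; this demo stream is always ready.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Drains a stream by polling it in a loop (busy-polls on `Pending`).
// Earlier items stay valid while later ones are produced, which is
// exactly what the detached design guarantees.
fn collect_all<S: Stream + Unpin>(mut stream: S) -> Vec<S::Item> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut items = Vec::new();
    loop {
        match Pin::new(&mut stream).poll_next(&mut cx) {
            Poll::Ready(Some(item)) => items.push(item),
            Poll::Ready(None) => return items,
            Poll::Pending => continue,
        }
    }
}

fn main() {
    let lines = Lines {
        remaining: VecDeque::from(vec!["first".to_string(), "second".to_string()]),
    };
    println!("{:?}", collect_all(lines));
}
```

An “attached” stream would instead hand out items that borrow from the stream itself (for example a slice into an internal buffer), so each item would have to be released before the next poll. This trait shape cannot express that.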

Carl’s motivation for a stable Stream is in part that he would like to issue a stable tokio release, ideally in Q3 of 2020, and Stream would be a part of that. If there is no Stream trait in the standard library, that complicates things.

One thing we didn’t discuss, but which I personally would like to understand better, is what sort of libraries and infrastructure might benefit from a stabilized Stream. For example, “data libraries” like hyper mostly want a trait like AsyncRead to be stabilized.

About async read

Next we discussed the AsyncRead trait a little, though not in great depth. If you’ve been following the latest discussion, you’ll have seen that there is a tokio proposal to modify the AsyncRead traits used within tokio. There are two main goals here:

to make it safe to pass an uninitialized memory buffer to read
to better support vectorizing writes

However, there isn’t a clear consensus on the thread (at least not the last time I checked) on the best alternative design. The PR itself proposes changing from a &mut [u8] buffer (for writing the output into) to a dyn trait value, but there are other options. Carl for example proposed using a concrete wrapper struct instead, and adding methods to test for vectorization support (since outer layers may wish to adopt different strategies based on whether vectorization works).
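To give a feel for the “concrete wrapper struct” option, here is a small sketch of a buffer wrapper that tracks how much of a possibly uninitialized buffer has been filled. The type and method names (UninitReadBuf, put_slice, filled, remaining) are invented for this example and are not taken from the PR or from tokio.

```rust
use std::mem::MaybeUninit;

// Hypothetical wrapper around a possibly uninitialized byte buffer.
// All names here are illustrative assumptions, not an existing API.
struct UninitReadBuf<'a> {
    buf: &'a mut [MaybeUninit<u8>],
    // Number of leading bytes that have been written (and thus initialized).
    filled: usize,
}

impl<'a> UninitReadBuf<'a> {
    fn new(buf: &'a mut [MaybeUninit<u8>]) -> Self {
        UninitReadBuf { buf, filled: 0 }
    }

    // Space still available for a reader to write into.
    fn remaining(&self) -> usize {
        self.buf.len() - self.filled
    }

    // Copies as many bytes from `src` as fit and returns how many were copied.
    // The reader only ever writes, so it never observes uninitialized memory.
    fn put_slice(&mut self, src: &[u8]) -> usize {
        let n = src.len().min(self.remaining());
        let dst = &mut self.buf[self.filled..self.filled + n];
        for (slot, byte) in dst.iter_mut().zip(src) {
            slot.write(*byte);
        }
        self.filled += n;
        n
    }

    // The initialized prefix, exposed as ordinary bytes.
    fn filled(&self) -> &[u8] {
        // SAFETY: the first `filled` slots were initialized by `put_slice`,
        // and `MaybeUninit<u8>` has the same layout as `u8`.
        unsafe { std::slice::from_raw_parts(self.buf.as_ptr() as *const u8, self.filled) }
    }
}

fn main() {
    // The caller never has to zero this storage before handing it out.
    let mut storage = [MaybeUninit::<u8>::uninit(); 8];
    let mut buf = UninitReadBuf::new(&mut storage);
    buf.put_slice(b"hello");
    println!("{:?}", buf.filled());
}
```

Because the wrapper is a concrete type, calls on it can be statically dispatched and inlined, which is one plausible reason to prefer it over a dyn trait value. The vectorization queries Carl mentioned are not modeled here.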

One of the arguments in favor of the current design from the futures crate is that it maps very cleanly to the Read trait from the stdlib (cramertj advanced this argument, for example). Carl felt that the trait is already quite different (e.g., notably, it uses Pin) and that these more “analogous” interfaces could be made with defaulted helper methods instead. Further, he felt that async applications tend to prize performance more highly than synchronous ones, so the importance and overhead of uninitialized memory may be higher.

About async destructors and other utilities

We discussed async destructors. Carl felt that they would be a valuable thing to add for sure. He felt that the “general design” proposed by boats would be reasonable, although he thought there might be a bit of a duplication issue if you have both an async drop and a sync drop. A possible solution would be to have a prepare_to_drop async method that gives the object time to do async preparations, and then to always run the sync drop afterwards.
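The prepare_to_drop idea can be emulated by hand today, which makes the intended ordering easier to see. In this sketch the async step is called explicitly, whereas the proposal would presumably have the language arrange that call; Connection, the shared log, teardown_order, and block_on are all hypothetical.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Illustrative resource that records its teardown steps in a shared log.
struct Connection {
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Connection {
    // Hypothetical async hook: a place for work that may need to await,
    // such as flushing buffered data.
    async fn prepare_to_drop(&mut self) {
        self.log.borrow_mut().push("async: flush");
    }
}

impl Drop for Connection {
    // The ordinary sync destructor always runs afterwards, so cleanup
    // logic lives in exactly one place.
    fn drop(&mut self) {
        self.log.borrow_mut().push("sync: drop");
    }
}

// Waker that does nothing; enough for futures that complete immediately.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Runs the two-phase teardown by hand and returns the recorded order.
fn teardown_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let mut conn = Connection { log: Rc::clone(&log) };
    block_on(conn.prepare_to_drop()); // phase 1: async preparation
    drop(conn); // phase 2: regular sync drop
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("{:?}", teardown_order());
}
```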

We also discussed a few utility macros like select!, and Carl mentioned that a lot of the ecosystem is currently using things like proc-macro-hack to support these, so perhaps a good thing to focus on would be improving procedural macro support so that it can handle expression-level macros more cleanly.

Comments?

There is a thread on the Rust users forum for this series.

Footnotes

¹ I think loom (https://crates.io/crates/loom) looks particularly cool.