Claiming, auto and otherwise
21 June 2024
This blog post proposes adding a third trait, Claim
, that would live alongside Copy
and Clone
. The goal of this trait is to improve Rust’s existing split, where types are categorized as either Copy
(for “plain old data”1 that is safe to memcpy
) and Clone
(for types that require executing custom code or which have destructors). This split has served Rust fairly well but also has some shortcomings that we’ve seen over time, including maintenance hazards, performance footguns, and (at times quite significant) ergonomic pain and user confusion.
TL;DR
The proposal in this blog post has three phases:
- Adding a new
Claim
trait that refinesClone
to identify “cheap, infallible, and transparent” clones (see below for the definition, but it explicitly excludes allocation). Explicit calls tox.claim()
are therefore known to be cheap and easily distinguished from calls tox.clone()
, which may not be. This makes code easier to understand and addresses existing maintenance hazards (obviously we can bikeshed the name). - Modifying the borrow checker to insert calls to
claim()
when using a value from a place that will be used later. So given e.g. a variabley: Rc<Vec<u32>>
, an assignment likex = y
would be transformed tox = y.claim()
ify
is used again later. This addresses the ergonomic pain and user confusion of reference-counted values in rust today, especially in connection with closures and async blocks. - Finally, disconnect
Copy
from “moves” altogether, first with warnings (in the current edition) and then errors (in Rust 2027). In short,x = y
would movey
unlessy: Claim
. MostCopy
types would also beClaim
, so this is largely backwards compatible, but it would let us rule out cases likey: [u8; 1024]
and also extendCopy
to types likeCell<u32>
or iterators without the risk of introducing subtle bugs.
For some code, automatically calling Claim
may be undesirable. For example, some data structure definitions track reference count increments closely. I propose to address this case by creating a “allow-by-default” automatic-claim
lint that crates or modules can opt-into so that all “claims” can be made explicit. This is more-or-less the profile pattern, although I think it’s notable here that the set of crates which would want “auto-claim” do not necessarily fall into neat categories, as I will discuss.
Step 1: Introducing an explicit Claim
trait
Quick, reading this code, can you tell me anything about it’s performance characteristics?
tokio::spawn({
// Clone `map` and store it into another variable
// named `map`. This new variable shadows the original.
// We can now write code that uses `map` and then go on
// using the original afterwards.
let map = map.clone();
async move { /* code using map */ }
});
/* more code using map */
Short answer: no, you can’t, not without knowing the type of map
. The call to map.clone()
may just be cloning a large map or incrementing a reference count, you can’t tell.
One-clone-fits-all creates a maintenance hazard
When you’re in the midst of writing code, you tend to have a good idea whether a given value is “cheap to clone” or “expensive”. But this property can change over the lifetime of the code. Maybe map
starts out as an Rc<HashMap<K, V>>
but is later refactored to HashMap<K, V>
. A call to map.clone()
will still compile but with very different performance characteristics.
In fact, clone
can have an effect on the program’s semantics as well. Imagine you have a variable c: Rc<Cell<u32>>
and a call c.clone()
. Currently this creates another handle to the same underlying cell. But if you refactor c
to Cell<u32>
, that call to c.clone()
is now creating an independent cell. Argh. (We’ll see this theme, of the importance of distinguishing interior mutability, come up again later.)
Proposal: an explicit Claim
trait distinguishing “cheap, infallible, transparent” clones
Now imagine we introduced a new trait Claim
. This would be a subtrait of Clone
that indicates that cloning is:
- Cheap: Claiming should complete in O(1) time and avoid copying more than a few cache lines (64-256 bytes on current arhictectures).
- Infallible: Claim should not encounter failures, even panics or aborts, under any circumstances. Memory allocation is not allowed, as it can abort if memory is exhausted.
- Transparent: The old and new value should behave the same with respect to their public API.
The trait itself could be defined like so:2
trait Claim: Clone {
fn claim(&self) -> Self {
self.clone()
}
}
Now when I see code calling map.claim()
, even without knowing what the type of map
is, I can be reasonably confident that this is a “cheap clone”. Moreover, if my code is refactored so that map
is no longer ref-counted, I will start to get compilation errors, letting me decide whether I want to clone
here (potentially expensive) or find some other solution.
Step 2: Claiming values in assignments
In Rust today, values are moved when accessed unless their type implement the Copy
trait. This means (among other things) that given a ref-counted map: Rc<HashMap<K, V>>
, using the value map
will mean that I can’t use map
anymore. So e.g. if I do some_operation(map)
, then gives my handle to some_operation
, preventing me from using it again.
Not all memcopies should be ‘quiet’
The intention of this rule is that something as simple as x = y
should correspond to a simple operation at runtime (a memcpy, specifically) rather than something extensible. That, I think, is laudable. And yet the current rule in practice has some issues:
- First,
x = y
can still result in surprising things happening at runtime. Ify: [u8; 1024]
, for example, then a few simple calls likeprocess1(y); process2(y);
can easily copy large amounts of data (you probably meant to pass that by reference). - Second, seeing
x = y.clone()
(or evenx = y.claim()
) is visual clutter, distracting the reader from what’s really going on. In most applications, incrementing ref counts is simply not that interesting that it needs to be called out so explicitly.
Some things that should implement Copy
do not
There’s a more subtle problem: the current rule means adding Copy
impls can create correctness hazards. For example, many iterator types like std::ops::Range<u32>
and std::vec::Iter<u32>
could well be Copy
, in the sense that they are safe to memcpy. And that would be cool, because you could put them in a Cell
and then use get
/set
to manipulate them. But we don’t implement Copy
for those types because it would introduce a subtle footgun:
let mut iter0 = vec.iter();
let mut iter1 = iter0;
iter1.next(); // does not effect `iter0`
Whether this is surprising or not depends on how well you know Rust – but definitely it would be clearer if you had to call clone
explicitly:
let mut iter0 = vec.iter();
let mut iter1 = iter0.clone();
iter1.next();
Similar considerations are the reason we have not made Cell<u32>
implement Copy
.
The clone/copy rules interact very poorly with closures
The biggest source of confusion when it comes to clone/copy, however, is not about assignments like x = y
but rather closures and async blocks. Combining ref-counted values with closures is a big stumbling block for new users. This has been true as long as I can remember. Here for example is a 2014 talk at Strangeloop in which the speaker devotes considerable time to the “accidental complexity” (their words, but I agree) they encountered navigating cloning and closures (and, I will note, how the term clone is misleading because it doesn’t mean a deep clone). I’m sorry to say that the situation they describe hasn’t really improved much since then. And, bear in mind, this speaker is a skilled programmer. Now imagine a novice trying to navigate this. Oh boy.
But it’s not just beginners who struggle! In fact, there isn’t really a convenient way to manage the problem of having to clone a copy of a ref-counted item for a closure’s use. At the RustNL unconf, Jonathan Kelley, who heads up the Dioxus Labs, described how at CloudFlare codebase they spent significant time trying to find the most ergonomic way to thread context (and these are not Rust novices).
In that setting, they had a master context object cx
that had a number of subsystems, each of which was ref-counted. Before launching a new task, they would handle out handles to the subsystems that task required (they didn’t want every task to hold on to the entire context). They ultimately landed on a setup like this, which is still pretty painful:
let _io = cx.io.clone():
let _disk = cx.disk.clone():
let _health_check = cx.health_check.clone():
tokio::spawn(async move {
do_something(_io, _disk, _health_check)
})
You can make this (in my opinion) mildly better by leveraging variable shadowing, but even then, it’s pretty verbose:
tokio::spawn({
let io = cx.io.clone():
let disk = cx.disk.clone():
let health_check = cx.health_check.clone():
async move {
do_something(io, disk, health_check)
}
})
What you really want is to just write something like this, like you would in Swift or Go or most any other modern language:3
tokio::spawn(async move {
do_something(cx.io, cx.disk, cx.health_check)
})
“Autoclaim” to the rescue
What I propose is to modify the borrow checker to automatically invoke claim
as needed. So e.g. an expression like x = y
would be automatically converted to x = y.claim()
if y
will be used again later. And closures that capture variables in their environment would respect auto-claim as well, so move || process(y)
would become { let y = y.claim(); move || process(y) }
if y
were used again later.
Autoclaim would not apply to the last use of a variable. So x = y
only introduces a call to claim
if it is needed to prevent an error. This avoids unnecessary reference counting.
Naturally, if the type of y
doesn’t implement Claim
, we would give a suitable error explaining that this is a move and the user should insert a call to clone
if they want to make a cloned value.
Support opt-out with an allow-by-default lint
There is definitely some code that benefits from having the distinction between moving an existing handle and claiming a new one made explicit. For these cases, what I think we should do is add an “allow-by-default” automatic-claim
lint that triggers whenever the compiler inserts a call to claim
on a type that is not Copy
. This is a signal that user-supplied code is running.
To aid in discovery, I would consider a automatic-operations
lint group for these kind of “almost always useful, but sometimes not” conveniences; effectively adopting the profile pattern I floated at one point, but just by making it a lint group. Crates could then add automatic-operations = 'deny"
(bikeshed needed) in the [lints]
section of their Cargo.toml
.
Step 3. Stop using Copy
to control moves
Adding “autoclaim” addresses the ergonomic issues around having to call clone
, but it still means that anything which is Copy
can be, well, copied. As noted before that implies performance footguns ([u8;1024]
is probably not something to be copied lightly) and correctness hazards (neither is an iterator).
The real goal should be to disconnect “can be memcopied” and “can be automatically copied”4. Once we have “autoclaim”, we can do that, thanks to the magic of lints and editions:
- In Rust 2024 and before, we warn when
x = y
copies a value that isCopy
but notClaim
. - In the next Rust edition (Rust 2027, presumably), we make it a hard error so that the rule is just tied to
Claim
trait.
At codegen time, I would still expect us to guarantee that x = y
will memcpy and will not invoke y.claim()
, since technically the Clone
impl may not be the same behavior; it’d be nice if we could extend this guarantee to any call to clone
, but I don’t know how to do that, and it’s a separate problem. Furthermore, the automatic_claims
lint would only apply to types that don’t implement Copy
.5
Frequently asked questions
All right, I’ve laid out the proposal, let me dive into some of the questions that usually come up.
Are you ??!@$!$! nuts???
I mean, maybe? The Copy/Clone split has been a part of Rust for a long time6. But from what I can see in real codebases and daily life, the impact of this change would be a net-positive all around:
- For most code, they get less clutter and less confusing error messages but the same great Rust taste (i.e., no impact on reliability or performance).
- Where desired, projects can enable the lint (declaring that they care about performance as a side benefit). Furthermore, they can distinguish calls to
claim
(cheap, infallible, transparent) from calls toclone
(anything goes).
What’s not to like?
What kind of code would #[deny(automatic_claims)]
?
That’s actually an interesting question! At first I thought this would correspond to the “high-level, business-logic-oriented code” vs “low-level systems software” distinction, but I am no longer convinced.
For example, I spoke with someone from Rust For Linux who felt that autoclaim would be useful, and it doesn’t get more low-level than that! Their basic constraint is that they want to track carefully where memory allocation and other fallible operations occur, and incrementing a reference count is fine.
I think the real answer is “I’m not entirely sure”, we have to wait and see! I suspect it will be a fairly small, specialized set of projects. This is part of why I this this is a good idea.
Well my code definitely wants to track when ref-counts are incremented!
I totally get that! And in fact I think this proposal actually helps your code:
- By setting
#![deny(automatic_claims)]
, you declare up front the fact that reference counts are something you track carefully. OK, I admit not everything will consider this a pro. Regardless, it’s a 1-time setup cost. - By distinguishing
claim
fromclone
, your project avoids surprising performance footguns (this seems inarguably good). - In the next edition, when we no longer make
Copy
implicitly copy, you further avoid the footguns associated with that (also inarguably good).
Is this revisiting RFC 936?
Ooh, deep cut! RFC 936 was a proposal to split Pod
(memcopyable values) from Copy
(implicitly memcopyable values). At the time, we decided not to do this.7 I am even the one who summarized the reasons. The short version is that we felt it better to have a single trait and lints.
I am definitely offering another alternative aiming at the same problem identified by the RFC. I don’t think this means we made the wrong decision at the time. The problem was real, but the proposed solutions were not worth it. This proposal solves the same problems and more, and it has the benefit of ~10 years of experience.8 (Also, it’s worth pointing out that this RFC came two months before 1.0, and I definitely feel to avoid derailing 1.0 with last minute changes – stability without stagnation!)
Doesn’t having these “profile lints” split Rust?
A good question. Certainly on a technical level, there is nothing new here. We’ve had lints since forever, and we’ve seen that many projects use them in different ways (e.g., customized clippy levels or even – like the linux kernel – a dedicated custom linter). An important invariant is that lints define “subsets” of Rust, they don’t change it. Any given piece of code that compiles always means the same thing.
That said, the profile pattern does lower the cost to adding syntactic sugar, and I see a “slippery slope” here. I don’t want Rust to fundamentally change its character. We should still be aiming at our core constituency of programs that prioritize performance, reliability, and long-term maintenance.
How will we judge when an ergonomic change is “worth it”?
I think we should write up some design axioms. But it turns out we already have a first draft! Some years back Aaron Turon wrote an astute analysis in the “ergonomics initiative” blog post. He identified three axes to consider:
- Applicability. Where are you allowed to elide implied information? Is there any heads-up that this might be happening?
- Power. What influence does the elided information have? Can it radically change program behavior or its types?
- Context-dependence. How much of do you have to know about the rest of the code to know what is being implied, i.e. how elided details will be filled in? Is there always a clear place to look?
Aaron concluded that "implicit features should balance these three dimensions. If a feature is large in one of the dimensions, it’s best to strongly limit it in the other two." In the case of autoclaim, the applicability is high (could happen a lot with no heads up) and the context dependence is medium-to-large (you have to know the types of things and traits they implement). We should therefore limit power, and this is why we put clear guidelines on who should implement Claim
. And of course for the cases where that doesn’t suffice, the lint can limit the applicability to zero.
I like this analysis. I also want us to consider “who will want to opt-out and why” and see if there are simple steps (e.g., ruling out allocation) we can take which will minimize that while retaining the feature’s overall usefulness.
What about explicit closure autoclaim syntax?
In a recent lang team meeting Josh raised the idea of annotating closures (and presumably async blocks) with some form of syntax that means “they will auto-capture things they capture”. I find the concept appealing because I like having an explicit version of automatic syntax; also, projects that deny automatic_claim
should have a lightweight alternative for cases where they want to be more explicit. However, I’ve not seen any actual specific proposal and I can’t think of one myself that seems to carry its weight. So I guess I’d say “sure, I like it, but I would want it in addition to what is in this blog post, not instead of”.
What about explicit closure capture clauses?
Ah, good question! It’s almost like you read my mind! I was going to add to the previous question that I do like the idea of having some syntax for “explicit capture clauses” on closures.
Today, we just have || $body
(which implicitly captures paths in $body
in some mode) and move || $body
(which implicitly captures paths in $body
by value).
Some years ago I wrote a draft RFC in a hackmd that I still mostly like (I’d want to revisit the details). The idea was to expand move
to let it be more explicit about what is captured. So move(a, b) || $body
would capture only a
and b
by value (and error if $body
references other variables). But move(&a, b) || $body
would capture a = &a
. And move(a.claim(), b) || $body
would capture a = a.claim()
.
This is really attacking a different problem, the fact that closure captures have no explicit form, but it also gives a canonical, lighterweight pattern for “claiming” values from the surrounding context.
How did you come up with the name Claim
?
I thought Jonathan Kelley suggested it to me, but reviewing my notes I see he suggested Capture
. Well, that’s a good name too. Maybe even a better one! I’ve already written this whole damn blog post using the name Claim
, so I’m not going to go change it now. But I’d expect a proper bikeshed before taking any real action.
I love Wikipedia (of course), but using the name passive data structure (which I have never heard before) instead of plain old data feels very… well, very Wikipedia. ↩︎
In point of fact, I would prefer if we could define the
claim
method as “final”, meaning that it cannot be overridden by implementations, so that we would have a guarantee thatx.claim()
andx.clone()
are identical. You can do this somewhat awkwardly by definingclaim
in an extension trait, like so, but it’d be a bit embarassing to have that in the standard library. ↩︎Interestingly, when I read that snippet, I had a moment where I thought “maybe it should be
async move { do_something(cx.io.claim(), ...) }
?”. But of course that won’t work, that would be doing the claim in the future, whereas we want to do it before. But it really looks like it should work, and it’s good evidence for how non-obvious this can be. ↩︎In effect I am proposing to revisit the decision we made in RFC 936, way back when. Actually, I have more thoughts on this, I’ll leave them to a FAQ! ↩︎
Oooh, that gives me an idea. It would be nice if in addition to writing
x.claim()
one could writex.copy()
(similar toiter.copied()
) to explicitly indicate that you are doing a memcpy. Then the compiler rule is basicaly that it will insert eitherx.claim()
orx.copy()
as appropriate for types that implementClaim
. ↩︎I’ve noticed I’m often more willing to revisit long-standing design decisions than others I talk to. I think it comes from having been present when the decisions were made. I know most of them were close calls and often began with “let’s try this for a while and see how it feels…”. Well, I think it comes from that and a certain predilection for recklessness. 🤘 ↩︎
This RFC is so old it predates rfcbot! Look how informal that comment was. Astounding. ↩︎
This seems to reflect the best and worst of Rust decision making. The best because autoclaim represents (to my mind) a nice “third way” in between two extreme alternatives. The worst because the rough design for autoclaim has been clear for years but it sometimes takes a long time for us to actually act on things. Perhaps that’s just the nature of the beast, though. ↩︎