Type system for borrowing permissions
26 September 2012
Well, I have not done too well with my goal of reading a research paper a day on the train (actually my initial goal was two papers, but seeing as how I’ve failed so spectacularly, I’ve dialed it back some). However, I’ve decided to give it another go. I’ve bought a printer now so I can print papers out (double-sided, no less!) at home (I had initially planned to buy an iPad or something, but a decent printer is only $100, and paper is still nicer to read and write notes on…you do the math). As additional motivation, I’m working again on the paper on Rust’s new borrowed pointers and so I have to catch up on a lot of related work.
To that end, I downloaded and read A Type System for Borrowing Permissions, published at POPL 2012 by Naden, Bocchino, Aldrich, and Bierhoff. It’s a very good read: pretty easy to follow and certainly directly related to a lot of what we’ve been doing in Rust. Overall, I found it to be both more and less expressive than Rust’s system. It’s hard to compare the complexity of the two: each is complex in its own way.
A brief summary of their work
The basic idea of their system is to label object types with permissions. There are several such permissions:
unique
— the only pointer to an objectimmutable
— the object can be shared but not modifiedshared
— the object can be freely shared and modifiedlocal immutable
— the object can be aliased from local variables but not mutatedlocal shared
— the object can be aliased from local variables and modified
Each variable is annotated with such a permission. Parameters are annotated not
only with a permission but with a “change” permission. For example, something like
Rust’s send()
function, which consumes its parameter, would be written:
void send<T>(unique >> none T send) { ... }
This indicates that the value must begin as unique but, after the function call, will have the permission “none”. You can also annotate the receiver in the typical, C++-inspired style:
void foo() unique { /* Can only be called if the receiver is unique */ }
Unique objects can be temporarily aliased using the “local”
permissions, such as local shared
or local immutable
. Such
objects cannot be used outside of local variables.
It is also possible to temporary alias fields of unique type, so long as they are stored in unique objects. This causes the object to be only partially valid: an aliased unique field is no longer unique, after all. An object in this state is called “unpacked”: it must be packed again before it can be passed around. Unpacking like this is only permitted when the owner of the field is unique.
Comparisons to Rust
This is quite close to Rust’s lifetime and borrowing system in many ways. The notion of packing and unpacking operates much like borrow check’s loans, for example. There are of course some differences.
How it is less expressive than Rust?
Storing borrowing items into data structures
There are two areas where our system is more expressive. The first is that, in their system, borrowed, aliasable things (“local permissions”) are confined to local variables; our borrowed pointers suffer from no such limitations. In fact, lifting this restriction was probably the major goal of the shift to using borrowed pointers in place of reference modes.
An example of where this limitation is problematic is when you have
some largish function that needs some context to perform its
computation. For example, in our compiler translation pass, you get
as input a “type context” (&TypeContext
) that contains information
about the types of all expressions. Then we create a crate context
(&CrateContext
) that is local to the compilation pass. Moreover we
make a function (&FnContext
) for each function and a block
(&BlockContext
) for each basic block. Each of these embeds a
pointer to the context up the chain, so from the block context you can
reach the function context, then the crate context, and finally the
type context. A system confined to local variables cannot express
this sort of pattern without passing each of the various contexts as a
parameter. You wind up with a lot of functions all taking some big
set of parameters initially—the usual solution is to refactor by
making a single object to pass around, but you won’t be able to do
that if you are using local aliases.
Regions allow us to move past this problem. We can permit arbitrary aliasing, secure in the knowledge that the type system will prevent these aliases from escaping out into the wild. In other words, we know that even if you package up all those redundant arguments into a struct, that struct itself will never outlive the function where it is created. This sort of reasoning is absolutely crucial to effectively support stack allocation, which is a primary goal of Rust (but which I doubt is a goal for Plaid; nonetheless I am sure they will encounter this problem from time to time).
Inherited mutability
The second is that our notion of inherited mutability lets us
freeze an entire family of owned objects at once. I intend to write
another tutorial-style blog post on this, so for now I’ll just say
that if you own a freezable data structure—and all the various
container types in Rust will eventually be freezable, I think—you
can convert it between mutable and immutable and back again. They
have a similar notion of converting unique
to immutable
, but as
far as I can tell this is a shall immutability that does not itself
extend to the unique objects within. It’s possible that I am
misreading the type rules on this point though. Also, I think Rust’s
notion of inherited mutability could be applied to their system
without great difficulty.
How is it more expressive than Rust?
The main thing which they can do that we cannot is express the idea of
borrowing a unique pointer but retaining the knowledge that the
pointer is unique. For us, borrowing is intrinsically tied to
aliasing—for one thing, borrowing a unique value does not make the
original inaccessible, it just lessens the things you can do with it.
For another, we always allow &T
types to be copied.
Basically, in our system, a ~T
type is always owned. So if you
write:
fn foo(bar: ~Map<K,V>) {
bar.insert(k, v);
for bar.each |k, v| {
... // (bar is immutable here)
}
}
This is a function which takes in a map, modifies it some, iterates
over it, and then (implicitly) frees it. The same function could, if
it wanted to, continue by sending bar
to another task, or doing
whatever. But, of course, it doesn’t here, and perhaps what I really
wanted was just to do some work on behalf of the caller foo()
.
Typically, the way that one expresses the idea that you want to take
a Map
but not consume it is to use a borrowed pointer:
fn foo(bar: &Map<K,V>) {
bar.insert(k, v); // (Error)
for bar.each |k, v| {
... // (bar is immutable here)
}
}
Unfortunately, as indicated in the comments, this code will not
compile. This is because when you borrow a freezable data structure
like a map, you must select if it is immutable (&Map<K,V>
) or
mutable (&mut Map<K,V>
). But this example uses insert()
, which
requires mutability, and iteration, which requires immutability
(modifying the map while iterating is a no-no).
There are ways to solve this problem. One is that, in the caller, you
could use a ~Mut<Map<K,V>>
: the Mut
type basically moves the
safety checks from compile-time to runtime. This means that if you
try to insert into the map while iterating your task will fail (just
as it would in Java, for example, or most other languages). However,
using the Mut
type adds some (slight) overhead and gives you fewer
static guarantees, so it is not ideal. It’s really intended for use
with managed data (e.g., @Mut<Map<K,V>>
), where there is no hope of
tracking the mutability statically since the map can be aliased.
Another option is to move the map into the function foo()
and then
return it back to the caller when foo()
has finished:
fn foo(mut bar: ~Map<K,V>) -> ~Map<K,V> {
bar.insert(k, v);
for bar.each |k, v| {
... // (bar is immutable here)
}
return map;
}
This is a bit annoying, but it works. It’s similar to other affine systems like Alms.
Under the system proposed in this paper, however, a parameter labeled
unique Map<K,V>
would not be consumed by the callee. Basically it
must be unique on entry to the callee and on exit. If the callee
wished to consume the map, such as by sending it to another task, it
would write unique >> none Map<K,V>
.