Baby Steps

A blog about programming, and tiny ways to improve it.

Closures and the Borrow Checker

I have been working on making the borrow checker treat closures in a sound way. I hope to land this patch very soon. I want to describe the impact of these changes and summarize what we might do in the future.

Rvalue Lifetimes in Rust

I’ve been working on Issue #3511, which is an effort to rationalize the lifetimes of temporary values in Rust. This issue has been a thorn in the side of Rust users for a while, because the current lifetimes are rather haphazard and frequently too short. Some time ago, I did some thinking on this issue and then let it lie while other things took priority.

Part of the reason that this issue has lasted so long is that the current trans cleanup scheme is very inflexible. I have a branch now that rewrites the cleanup system so that it can handle any rules we would like. The problem I am encountering now, of course, is that it’s unclear what the rules should be. I want to lay out the options I see.

DST, Take 5

I believe I have come to the point where I am ready to make a final proposal for DST. Ironically, this proposal is quite similar to where I started, but somewhat more expansive. It seems to be one of those unusual cases where supporting more features actually makes things easier. Thanks to Eridius on IRC for pointing this out to me. I intend for this post to stand alone, so I’m going to start from the beginning in the description.

I am reasonably confident that this DST proposal hangs together because I have taken the time to develop a formal model of Rust and then to extend that model with DST. The model was done in Redex and can be found here. Currently it only includes reduction semantics. I plan a separate blog post describing the model in detail.


Dynamically sized types

The core idea of this proposal is to introduce two new types [T] and Trait. [T] represents “some number of instances of T laid out sequentially in memory”, but the exact number if unknown. Trait represents “some type T that implements the trait Trait”.

Both of these types share the characteristic that they are existential variants of existing types. That is, there are corresponding types which would provide the compiler with full static information. For example, [T] can be thought of as an instance of [T, ..n] where the constant n is unknown. We might use the more traditional – but verbose – notation of exists n. [T, ..n] to describe the type. Similarly, Trait is an instance of some type T that implements Trait, and hence could be written exists T:Trait. T. Note that I do not propose adding existential syntax into Rust; this is simply a way to explain the idea.

These existential types have an important facet in common: their size is unknown to the compiler. For example, the compiler cannot compute the size of an instance of [T] because the length of the array is unknown. Similarly, the compiler cannot compute the size of an instance of Trait because it doesn’t know what type that really is. Hence I refer to these types as dynamically sized – because the size of their instances is not known at compilation time. More often, I am sloppy and just call them unsized, because everybody knows that – to a compiler author, at least – compile time is the only interesting thing, so if we don’t know the size at compile time, it is equivalent to not knowing it at all.

Restrictions on dynamically sized types

Because the type (and sometimes alignment) of dynamically sized types is unknown, the compiler imposes various rules that limit how instances of such types may be used. In general, the idea is that you can only manipulate an instance of an unsized type via a pointer. So for example you can have a local variable of type &[T] (pointer to array of T) but not [T] (array of T).

Pointers to instances of dynamically sized types are fat pointers – that means that they are two words in size. The secondary word describes the “missing” information from the type. So a pointer like &[T] will consist of two words: the actual pointer to the array, and the length of the array. Similarly, a pointer like &Trait will consist of two words: the pointer to the object, and a vtable for Trait.

I’ll cover the full restrictions later, but the most pertinent are:

  1. Variables and arguments cannot have dynamically sized types.
  2. Only the last field in a struct may have a dynamically sized type; the other fields must not. Enum arguments must not have dynamically sized types.

unsized keyword

Any type parameter which may be instantiated with an unsized type must be designed using the unsized keyword. This means that <T> is not the most general definition for a type parameter; rather it should be <unsized T>.

I originally preferred for all parameters to be unsized by default. However, it seems that the annotation burden here is very high, so for the moment we’ve been rejecting this approach.

The unsized keyword crops up in a few unlikely places. One particular place that surprised me is in the declaration of traits, where we need a way to annotate whether the Self type may be unsized. It’s not entirely clear what this syntax should be. trait Foo<unsized Self>, perhaps? This would rely on Self being a keyword, I suppose. Another option is trait Foo : unsized, since that is typically where bounds on Self appear. (TBD)

Bounds in type definitions

Currently we do not permit bounds in type declarations. The reasoning here was basically that, since a type declaration never invokes methods, it doesn’t need bounds, and we could mildly simplify things by leaving them out.

But the DST scheme needs a way to tag type parameters as potentially unsized, which is a kind of bound (in my mind). Moreover, we [also need bounds to handle destructors][drop], so I think this rule against bounds in structs is just not tenable.

Once we permit bounds in structs, we have to decide where to enforce them. My proposal is that we check bounds on the type of every expression. Another option is just to check bounds on struct literals; this would be somewhat more efficient and is theoretically equivalent, since it ensures that you will not be able to create an instance of a struct that does not meet the struct’s declared boundaries. However, it fails to check illegal transmute calls.

Creating an instance of a dynamically sized type

Instances of dynamically sized types are obtained by coercing an existing instance of a statically sized type. In essence, the compiler simply “forgets” a piece of the static information that it used to know (such as the length of the vector); in the process, this static bit of information is converted into a dynamic value and added into the resulting fat pointer.


The most obvious cases to be permitted are coercions from built-in pointer types, such as &[T, ..n] to &[T] or &T to &Trait (where T:Trait). Less obvious are the rules to support coercions for smart pointer types, such as Rc<[T, ..n]> being casted to Rc<[T]>.

This is a bit more complex than it appears at first. There are two kinds of conversions to consider. This is easiest to explain by example. Let us consider a possible definition for a reference-counting smart pointer Rc:

struct Rc<unsized T> {
    ptr: *RcData<T>,

    // In this example, there is no need of more fields, but
    // for purposes of illustration we can imagine that there
    // are some additional fields here:
    dummy: uint

struct RcData<unsized T> {
    ref_count: uint,

    #[max_alignment] // explained later
    data: T,

From this definition you can see that a reference-counted pointer consists of a pointer to an RcData struct. The RcData struct embeds a reference count followed by the data from the pointer itself.

We wish to permit a type like Rc<[T, ..n]> to be cast to a type like Rc<[T]>. This is shown in the following code snippet.

let rc1: Rc<[T, ..3]> = ...;
let rc2: Rc<[T]> = rc1 as RC<[T]>;

What is interesting here is that the type we are casting to, RC<[T]>, is not actually a pointer to an unsized type. It is a struct that contains such a pointer. In other words, we could convert the code fragment above into something equivalent but somewhat more verbose:

let rc1: Rc<[T, ..3]> = ...;
let rc2: Rc<[T]> = {
    let Rc { ptr: ptr1, dummy: dummy1 } = rc1;
    let ptr2 = ptr as *RcData<[T]>;
    Rc { ptr: ptr2, dummy: dummy }

In this example, we have unpacked the pointer (and dummy field) out of the input rc1 and then cast the pointer itself. This second cast, from ptr1 to ptr2, is a cast from a thin pointer to a fat pointer. We then repack the data to create the new pointer. The fields in the new pointer are the same, but because the ptr field has been converted from a thin pointer to a fat pointer, the offsets of the dummy field will be adjusted accordingly.

So basically there are two cases to consider. The first is the literal conversion from thin pointers to fat pointers. This is relatively simple and is defined only over the builtin pointer types (currently: &, *, and ~). The second is the conversion of a struct which contains thin pointer fields into another instance of that same stuct type where fields are fat pointers. The next section defines these rules in more detail.

Conversion rules

Let’s start with the rule for converting thin pointers into fat pointers. This is based on the relation Fat(T as U) = v. This relation says that a pointer to T can be converted to a fat pointer to U by adding the value v. Afterwards, we’ll define the full rules that define when T as U is permitted.

Conversion from thin to fat pointers

There are three cases to define the Fat() function. The first rule Fat-Array permits converting a fixed-length array type [T, ..n] into the type [T] for an of unknown length. The second half of the fat pointer is just the array length in that case.

  Fat([T, ..n] as [T]) = n

The second rule Fat-Object permits a pointer to some type T to be coerced into an object type for the trait Trait. This rule has three conditions. The first condition is simply that T must implement Trait, which is fairly obvious. The second condition is that T itself must be sized. This is less obvious and perhaps a bit unfortunate, as it means that even if a type like [int] implements Trait, we cannot create an object from it. This is for implementation reasons: the representation of an object is always (pointer, vtable), no matter the type T that the pointer points at. If T were dynamically sized, then pointer would have to be a fat pointer – since we do not known T at compile time, we would have no way of knowing whether pointer was a thin or fat pointer. What’s worse, the size of fat pointers would be effectively unbounded. The final condition in the rule is that the type T has a suitable alignment; this rule may not be necessary. See Appendix A for more discussion.

  T implements Trait
  T is sized
  T has suitable alignment (see Appendix A)
  Fat(T as Trait) = vtable

The final rule Fat-Struct permits a pointer to a struct type to be coerced so long as all the fields will still have the same type except the last one, and the last field will also be a legal coercion. This rule would therefore permit Fat(RcData<[int, ..3]> as RcData<[int]>) = 3, for example, but not Fat(RcData<int> as RcData<float>).

  T_0 ... T_i T_n = field-types(R<T ...>)
  T_0 ... T_i U_n = field-types(R<U ...>)
  Fat(T_n as U_n) = v
  Fat(R<T...> as R<U...>) = v
Coercion as a whole

Now that we have defined that Fat() function, we can define the full coercion relation T as U. This relation states that the type T is coercable to the type U; I’m just focusing on the DST-related coercions here, though we do in fact do other coercions. The rough idea is that the compiler allows coercion not only simple pointers but also structs that include pointers. This is needed to support smart pointers, as we’ll see.

The first and simplest rule is the identity rule, which states that we can “convert” a type T into a type T (this is of course just a memcpy at runtime – note that if T is affine then coercion consumes the value being converted):

  T as T

The next rule states that we can convert a thin pointer into a fat pointer using the Fat() rule that we described above. For now I’ll just give the rule for unsafe pointers, but analogous rules can be defined for borrowed pointers and ~ pointers:

  Fat(T as U) = v
  *T as *U

Finally, the third rule states that we can convert a struct R<T...> into another instance of the same struct R<U...> with different type parameters, so long as all of its fields are pairwise convertible:

  T_0 ... T_n = field-types(R<T ...>)
  U_0 ... U_n = field-types(R<U ...>)
  forall i. T_i as U_i
  R<T...> as R<U...>

The purpose of this rule is to support smart pointer coercion. Let’s work out an example to see what I mean. Imagine that I define a smart pointer type Rc for ref-counted data, building on the RcData type I introduced earlier:

struct Rc<unsized U> {
    data: *RcData<U>

Now if I had a ref-counted, fixed-length array of type Rc<[int, ..3]>, I might want to coerce this into a variable-length array Rc<[int]>. This is permitted by the Coerce-Struct rule. In this case, the Rc type is basically a newtyped pointer, so it’s particularly simple, but we can permit coercions so long as the individual fields either have the same type or are converted from a thin pointer into a fat pointer.

These rules as I presented them are strictly concerned with type checking. The code generation of such casts is fairly straightforward. The identity relation is a memcpy. The thin-to-fat pointer conversion consists of copying the thin pointer and adding the runtime value dictated by the Fat function. The struct conversion is just a recursive application of these two operations, keeping in mind that the offset in the destination must be adjusted to account for the size of the increased size of the fat pointers that are produced.

Working with values of dynamically sized types

Just as with coercions, working with DST values can really be described in two steps. The first are the builtin operators. These are only defined over builtin pointer types like &[T]. The second is the method of converting an instance of a smart pointer type like RC<[T]> into an instance of a builtin pointer type. We’ll start by examining the mechanism for converting smart pointers into builtin pointer types, and then examine the operations themselves.

Side note: custom deref operator

The key ingredients for smart pointer integration is an overloadable deref operator. I’ll not go into great detail, but the basic idea is to define various traits that can be implemented by smart pointer types. The precise details of these types merits a separate post and is somewhat orthogonal, but let’s just examine the simplest case, the ImmDeref deref:

trait ImmDeref<unsized T> {
    fn deref<'a>(&'a self) -> &'a T;

This trait would be implemented by most smart pointer types. The type parameter T is the type of the smart pointer’s referent. The trait says that, given a (borrowed) smart pointer instance with lifetime 'a, you can dereference that smart pointer and obtain a borrowed pointer to the referent with lifetime 'a.

For example, the Rc type defined earlier might implement ImmDeref as follows:

impl<unsized T> ImmDeref<T> for Rc<T> {
    fn deref<'a>(&'a self) -> &'a T {
        unsafe {

Note: As implied by my wording above, I expect there will be a small number of deref traits, basically to encapsulate different mutability behaviors and other characteristics. I’ll go into this in a separate post.

Indexing into variable length arrays

rustc natively supports indexing into two types: [T] (this is a fat pointer, where the bounds are known dynamically) and [T, ..n] (a fixed length array, where the bounds are known statically). In the first case, the type rules ensure that the [T] value will always be located as the referent of a fat pointer, and the bounds can be loaded from the fat pointer and checked dynamically. In the second case, the bounds are inherent in the type (though they must still be checked, unless the index is a compile-time constant). In addition, the set of indexable types can be extended by implementing the Index trait. I will ignore this for now as it is orthogonal to DST.

To participate in indexing, smart pointer types do not have to do anything special, they need simply overload deref. For example, given an expresion r[3] where r has type Rc<[T]>, the compiler will handle it as follows:

  1. The type Rc<[T]> is not indexable, so the compiler will attempt to dereference it as part of the autoderef that is associated with the indexing operator.
  2. Deref succeeds because Rc implements the ImmDeref trait described above. The type of *r is thus &[T].
  3. The type &[T] is not indexable either, but it too can be dereferenced.
  4. We now have an lvalue **r of type [T]. This type is indexable, so the search completes.

Invoking methods on objects

This works in a similar fashion to indexing. The normal autoderef process will lead to a type like Rc<Trait> being converted to &Trait, and from there method dispatch proceeds as normal.

Drop glue and so on

There are no particular challenges here that I can see. When asked to drop a value of type [T] or Trait, the information in the fat pointer should be enough for the compiler to proceed.

By way of example, here is how I imagine Drop would be implemented for Rc:

impl<unsized T> Drop for Rc<T> {
    fn drop<'a>(&'a mut self) {
        unsafe {
            intrinsics::drop(&mut (*self.ptr).data);


A. Alignment for fields of unsized type

There is one interesting subtlety concerning access to fields of type Trait – in such cases, the alignment of the field’s type is unknown, which means the compiler cannot statically compute the field’s offset. There are two options:

  1. Extract the alignment information from the vtable, which is present, and compute the offset dynamically. More complicated to codegen but retains maximal flexibility.
  2. Require that the alignment for fields of (potentially) unsized type be statically specified, or else devise custom alignment rules for such fields corresponding to the same alignment used by malloc().

The latter is less flexible in that it implies types with greater alignment requirements cannot be made into objects, and it also implies that structs with low-alignment payloads, like RC<u8>, may be bigger than they need to be, strictly speaking.

The other odd thing about solution #2 is that it implies that a generic structure follows separate rules from a specific version of that structure. That is, given declarations like the following:

struct Foo1<T> { x: T }
struct Foo2<unsized U> { x: U }
struct Foo3 { x: int }

It is currently true that Foo1<int>, Foo2<int>, and Foo3 are alike in every particular. But under solution 2 the alignment of Foo2<int> may be greater than Foo1<int> or Foo3 (since the field x was declared with unsized type U and hence has maximal alignment).

B. Traits and objects

Currently, we only permit method calls with an object receiver if the method meets two conditions:

  • The method does not employ the Self type except as the type of the receiver.
  • The method does not have any type parameters.

The reason for the first restriction is that, in an object, we do not know what value the type parameter Self is instantiated with, and therefore we cannot type check such a call. The reason for the second restriction is that we can only put a single function pointer into the vtable and, under a monomorphization scheme, we potentially need an infinite number of such methods in the vtable. (We could lift this restriction if we supported an “erased” type parameter system, but that’s orthogonal.)

Under a DST like system, we can easily say that, for any trait Trait where all methods meet the above restrictions, then the dynamically sized type Trait implements the trait Trait. This seems rather logical, since the type Trait represents some unknown type T that implements Trait. We must however respect the above two restrictions, since we will still be dispatching calls dynamically.

(It might even be simpler, though less flexible, to just say that we can only create objects for traits that meet the above two restrictions.)

Structural Arrays in Typed Objects

Dave Herman and I were tossing around ideas the other day for a revision of the typed object specification in which we remove nominal array types. The goal is to address some of the awkwardness that we have encountered in designing the PJS API due to nominal array types. I thought I’d try writing it out. This is to some extent a thought experiment.

Thoughts on DST, Part 4

Over the Thanksgiving break I’ve been devoting a lot of time to thinking about DST and Rust’s approach to vector and object types. As before, this is very much still churning in my mind so I’m just going to toss out some semi-structured thoughts.

Brief recap

Treating vectors like any other container. Some time back, I wrote up a post about how we could treat vectors like any other container, which would (to some extent) avoid the need for DST.

Dynamically sized types (DST). In Part 1 of the series, I sketched out how “Dynamically Sized Types” might work. In that scheme, [T] is interpreted as an existential type like exists N. [T, ..N], and Trait is interpreted as exists T:Trait. T. The type system ensures that DSTs always appear behind one of the builtin pointer types, and those pointer types become fat pointers:

  • Advantage. Impls for objects and vectors work really well.
  • Disadvantage. Hard to square with user-defined smart pointers like RC<[int]>. The problem is worse than I presented in that post, I’ll elaborate a bit more.

Statically sized types (SST). In Part 2 of the series, I sketched out an alternative scheme that I later dubbed “Statically Sized Types”. In this scheme, in some ways similar to today, [T] and Trait are not themselves types, but rather shorthands for existential types where the exists qualifier is moved outside the smart pointer. For example, ~[T] becomes exists N. ~[T, ..N]. The scheme does not involve fat pointers; rather, the existential type carries the length, and the thin pointer is embedded within the existential type.

  • Advantage. It is easy to create a type like RC<[int]> from an existing RC<[int, ..N]> (and, similarly, an RC<Trait> from an existing RC<T>).
  • Disadvantage. Incompatible with monomorphization except via virtual calls. I described part of the problem in Part 3 of the series. I’ll elaborate a bit more here.

Where does that leave us?

So, basically, we are left with two flawed schemes. In this post I just want to elaborate on some of the thoughts I had over Thanksgiving. Roughly speaking they are three:

  1. DST and smart pointer interaction is even less smooth than I thought, but workable for RC at least.
  2. SSTs, vectors, and smart pointers are just plain unworkable.
  3. SSTs, objects, and smart pointers work out reasonable well.

At the end, I suggest two plausible solutions that seem workable to me at this point:

Thoughts on DST, Part 3

After posting part 2 of my DST series, I realized that I had focusing too much on the pure “type system” aspect and ignoring some of the more…mundane semantics, and in particular the impact of monomorphization. I realize now that – without some further changes – we would not be able to compile and execute the second proposal (which I will dub statically sized typed (SST) from here on out). Let me first explain the problem and then show how my first thoughts on how it might be addressed.

This is part 3 of a series:

  1. Examining DST.
  2. Examining an alternative formulation of the current system.
  3. …and then a twist.
  4. Further complications.

Thoughts on DST, Part 2

In the previous post I elaborated a bit on DSTs and how they could be created and used. I want to look a bit now at an alternate way to support the combination of vector types and smart pointers (e.g., RC<[uint]>). This approach avoids the use of DSTs. We’ll see that it also addresses some of the rough patches of DST, but doesn’t work quite as well for object types.

This is part 2 of a series:

  1. Examining DST.
  2. Examining an alternative formulation of the current system.
  3. …and then a twist.
  4. Further complications.

Thoughts on DST, Part 1

In the past, I’ve been quite the champion of dynamically sized types (DST). Specifically what this means is that things like [T] and Trait would be “types” in the Rust type system. Lately I’ve been investing a lot of effort thinking through the ramifications of offering better support for smart pointers, and in particular how this interacts with dynamically sized types, and I am no longer persuaded that DST offer the best way forward. I’m a bit unsure, though, and the topic is complicated, so I wanted to stop and write up a short series of posts laying out my thought process thus far. This post will describe what it would mean to offer DST in more detail. I don’t plan to give a lot of Rust background, since there’s enough to talk about.

This is part 1 of a series:

  1. Examining DST.
  2. Examining an alternative formulation of the current system.
  3. …and then a twist.
  4. Further complications.

Optimizing SIMD, Part 2

A quick follow-up to my previous post. The approach I suggested (“generate boxing instructions but bypass them when possible”) is in some sense pessimistic: we generate the instructions we need for the worst case and then cleanup. Like many problems in computer science, it has an optimistic dual. We could generate unboxed data and then insert boxes where needed. In fact, we have an existing mechanism for doing that, called the type policies. Basically, there is a phase where each MIR opcode goes through and examines the types of its inputs, attempting to reconcile those types with what it needs, either by boxing or unboxing as needed.

In a [comment on my original post][c], Haitao suggested using the more optimistic approach. It is also something I had discussed briefly with Luke in the past. I agree it is a good way to go; the main reason I held off on suggesting it is that I thought using the pessimistic approach (particularly to start) would be a faster way of getting going. Also, I wasn’t sure if having objects (the float32x4 wrappers) be represented by unpacked values during the IonBuilder phase would violate some invariants.

The reason that I think it’ll be easier to get started if we create the wrapped instructions to begin with is that they can be used in the bailouts. This means that we’ll be able to implement and test the required changes to the register allocator and code generator first. Of course, performance will be lackluster since we’ll still be allocating temporaries.

The pessimistic approach is also easier to implement. It requires only a few lines of code, basically just a helper for unboxing float32x4 values which first checks “is this a boxed float32x4? let me just pull out the input and return that directly rather than generating a load instruction”. However, it won’t yield as good results, since it doesn’t handle phis as well. Until Haitao suggested it, I hadn’t thought about using the optimistic approach to address phis.

In any case, either technique seems like a good choice to me. In the interest of making changes as small as possible, what I’d prefer is to start with the pessimistic approach and then move to the optimistic approach once things work.

If we take this approach (pessimistic first, optimistic later), then this is the high-level set of tasks. I used indentation to convey dependencies.

  • Add the simd module, either in C++ code or (preferably) ported to self-hosting. We have to be careful to get the float32 addition semantics here, but thanks to Math.fround that’s not too hard.
  • Add MIR types and operations, using pessimistic optimization to start
    • Augment register allocator and LIR to handle vector types
    • Augment bailout code to accept unboxed vectors and box them
      • Implement changes to type policy and generate unboxed data by default (“optimistic approach”)
  • Eventually, correct the value type semantics – right now we are representing float32x4 values as objects. The correct strategy for permitting new kinds of values (in the sense of a distinct typeof) is being discussed actively in bug 654416, and we will eventually latch onto whatever that strategy becomes.

Optimizing SIMD

There is currently some ongoing effort to implement the proposed JavaScript SIMD API in Firefox. The basic idea of the API is to introduce explicit vector value types called float32x4 and int32x4. These types fit into the typed objects hierarchy, so you can create arrays of them, embed them in structs, and so forth.

The semantics of these vectors types is designed to make it possible for JIT engines to detect and optimize their use. One crucial bit is that they are values and hence do not have identity. Basically float32x4 values work like numbers and strings do today – they are equal if they have the same value. This is quite different from objects, which may be unequal if their properties are the same (e.g., {} !== {}).

The purpose of this post is to outline one possible strategy for optimizing the use of SIMD types in IonMonkey. The initial set of optimizations don’t require any new analyses, though we can add an optional simple analysis in order to optimize more programs. This analysis is actually fairly general and can be applied to other kinds of typed objects.