Avoiding region explosion in Rust

28 March 2012

pcwalton and I (but mostly pcwalton) have been hard at work implementing regions in Rust. We are hoping to use regions to avoid a lot of memory allocation overhead in the compiler—the idea is to use memory pools (a.k.a. arenas) so that we can cheaply allocate the data needed to process a given function and then release it all in one shot. It is well known that arenas are great fit for the memory allocation patterns of a compiler, which tend to produce a lot of data that lives for the duration of a pass but is not needed afterwards.

In any case, recently we had a discussion about how we can use regions in the trans pass of the compiler: this is the pass which converts from our internal representation (IR) to the LLVM’s IR. I thought it was worth sharing the result of this discussion. The basic summary is that we are able to make use of region subtyping to accommodate a fairly complex pattern of lifetimes with very little annotation overhead.

The setting: contexts in trans

First, let me introduce the problem: during translation, we produce a lot “contexts”, which store needed data about the state of the translation. For our purposes, there are three contexts of note: the crate context, or ccx, which stores crate-wide data such as linkage information about top-level functions, constants, and so forth; the function context, or fcx, which stores per-function data such as references to the LLVM variables representing its parameters and locals; and finally the block context, or bcx, which stores information about a single basic block in the control-flow graph.

What we would like to do is to create the crate context ccx on the stack when we enter the translation phase for the crate as a whole. Later, when we begin to translate a given function, we will allocate its function context fcx on the stack as well. The block contexts, however, are a little different: they do not fully obey a stack discipline. That is, it is common for a function to create a new block context and return it to its caller, perhaps with a signature like the following:

fn compile_if_then_else(bcx0: @block_ctxt,
                        cond: @expr,
                        then_blk: @code_block,
                        else_blk: @code_block) -> @block_ctxt

This function would presumably generate the diamond-shaped if-then-else pattern. The initial block is the block represented by bcx0. The function will compile the condition cond and generate branch to one of two new basic blocks representing the true and false paths. The code might look something like this (note: this is not the actual code in rustc, which is naturally much messier):

let (bcx1, val) = compile_expr(bcx0, cond);
let mut bcx_true = new_bcx(bcx0.fcx);
let mut bcx_false = new_bcx(bcx0.fcx);
add_instr(bcx1, if(val, bcx_true, bcx_false));

The then and else blocks could then be compiled in the contexts of those true and false blocks:

bcx_true = compile_block(bcx_true, then_blk);
bcx_false = compile_block(bcx_false, else_blk);

And finally the two paths can be merged into a new block, which is the block that gets returned:

let bcx_join = new_bcx();
add_instr(bcx_true, goto(bcx_join));
add_instr(bcx_false, goto(bcx_join));
ret bcx_join;

The problem: expressing context lifetimes with regions

Let’s dig a bit more into the representation of these contexts. The details aren’t too important but I want to focus on the region-related aspects that describe their lifetimes. Remember that there is a crate context ccx that is valid for the translation of the entire crate. Its contents are not important, so let’s just assume it’s some record:

type crate_ctxt = {
     ...
};

Then there is a function context. It contains a pointer to the crate context, along with some other stuff:

type func_ctxt = {
    ccx: &crate_ctxt,
    ...
};

Finally the block context, which contains a pointer to the function context:

type block_ctxt = {
    fcx: &func_ctxt,
    ...
};

Here I have shown the pointers as region pointers, but I haven’t written any explicit region annotations. The question is, what regions should we associate with those pointers?

The maximally expressive approach

If you wanted to take the maximally expressive approach, you would wind up with a lot of region parameters. For now I will show this in a very explicit syntax in which types are given explicit Region parameters, but this syntax is not valid Rust and (hopefully) never will be:

type crate_ctxt = {
     ...
};

type func_ctxt<&c> = {
    ccx: &c.crate_ctxt,
    ...
};

type block_ctxt<&f,&c> = {
    fcx: &f.func_ctxt<&c>,
    ...
};

You can see the problem. The type for the block context must be annotated with two region parameters, one to describe the region of the function context and one for the crate context.

In this technique, if we have a variable bcx of type &b.bcx<&f,&c>, then bcx.fcx.ccx will have type &c.ccx: the precisely correct region, presumably.

For reference, the signature of compile_if_then_else() would become:

fn compile_if_then_else(bcx0: &b.block_ctxt<&f,&c>,
                        cond: @expr,
                        then_blk: @code_block,
                        else_blk: @code_block) -> &b.block_ctxt<&f,&c>

The minimally expressive approach

The approach we plan to take is much simpler. Types do not have region parameters. Instead, when we instantiate an &T type to a specific region, the outermost & in a function prototype is assigned a fresh region, but & which appear within that type are assigned to this same fresh region. This means that if we have a variable bcx with type &b.bcx, then bcx.fcx.ccx will have type &b.ccx: this is an underapproximation of the lifetime of the crate context. The true lifetime is &c which is some superset of &b. The reason that this whole scheme type checks is because of the subtyping relationships between region pointers: a reference with a longer lifetime (like &c.ccx) can be used wherever a reference with a shorter lifetime (like &b.ccx) is expected.

Under this approach, the signature of compile_if_then_else() becomes:

fn compile_if_then_else(bcx0: &b.block_ctxt,
                        cond: @expr,
                        then_blk: @code_block,
                        else_blk: @code_block) -> &b.block_ctxt

Not so bad.

Arenas and placement new

One question remains: because the lifetime of block contexts is not bound by the call stack, how can we manage their allocation without resorting to heap allocation (the function context and crate context can be allocated on the stack)? The answer is that we will use arenas.

An arena is basically a pool of memory in which we can allocate lots of data and then release the pool all in one shot. This is very cheap but only suitable for places where allocation follows a “phase-based” pattern.

We will use a memory pool which is allocated and released per-function. Therefore, the pool itself will be stored in the function context:

type func_ctxt = {
    ccx: &crate_ctxt,
    pool: &memory_pool,
    ...
};

In the current Rust type system, anyhow, a memory pool can be any type for which there exists an impl offering an alloc(sz: uint, align: uint) -> *() method, which allocates sz bytes of memory at the given alignment and returns a pointer. An expression like new (pool) value will cause pool.alloc() to be invoked and will then store the value into the memory location that was returned. The result is a region pointer in the same region as the pool itself.

This means that allocating a new block context looks something like:

fn new_bcx(fcx: &f.func_ctxt) -> &f.func_ctxt {
    new (fcx.pool) {fcx: fcx, ...}        
}

The Summary

The basic idea of the approach is to retain less information. For a given region pointer p, all you know is that any data reachable via some path like p.f.g.h will be live as long as p is live. It seems that this is enough in practice for most real use cases. Time will tell, I suppose.