Why not modes?

8 December 2011

Marijn asked me what it is that I dislike about parameter modes. I thought I might as well explain here.

For background, today in Rust a function can declare each parameter in one of several modes:

By value (++): No pointer is used but the value is not owned by the callee. Therefore, the callee does not need to free it, for example, or decrement a ref count.
By immutable reference (&&): a pointer to the variable in the caller’s stack frame is passed, but the callee cannot use it to make changes. Can be passed an lvalue or an rvalue.
By mutable reference (&): a pointer to the variable in the caller’s stack frame is passed, and the callee can use it to reassign the variable. Can only be passed an lvalue.
By copy (+): A fresh copy of the value is created and the callee must dispose of it.
By move (-): The value is moved from the caller’s stack frame and the callee must dispose of it.

So what don’t I like about modes?

Modes are invisible for the caller

Someone reading a call cannot tell whether an argument is being passed by reference or by value. For example, quick, what does this function (from linux_os.rs) return:

fn waitpid(pid: pid_t) -> i32 {
    let status = 0i32;
    os::libc::waitpid(pid, status, 0i32);
    ret status;
}

When I first read it, I thought it must always return 0. But in fact the function os::libc::waitpid() is defined with its second parameter as a mutable reference, and so it can modify the value of status.

I much prefer the C convention of passing a pointer. In that case, the function above would be written:

fn waitpid(pid: pid_t) -> i32 {
    let status = 0i32;
    os::libc::waitpid(pid, &status, 0i32);
    ret status;
}

Now it is clear that status might be modified by waitpid().
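To make the point concrete, here is a minimal sketch written in the syntax of later Rust releases rather than the dialect used in this post. The function fake_waitpid is a hypothetical stand-in for os::libc::waitpid(), and the status value it writes is made up for illustration.

```rust
// Hypothetical stand-in for os::libc::waitpid (not the real binding).
// It writes an arbitrary status through the pointer it is handed.
fn fake_waitpid(_pid: i32, status: &mut i32, _options: i32) -> i32 {
    *status = 42; // made-up value, chosen only for illustration
    0
}

fn waitpid(pid: i32) -> i32 {
    let mut status = 0i32;
    // The explicit `&mut` at the call site tells the reader that
    // `status` may be modified by the callee.
    fake_waitpid(pid, &mut status, 0);
    status
}
```

Here the call site itself shows that status is handed out for modification, so nobody would guess that the wrapper always returns 0.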

Copy and move modes divide responsibility in a strange way

Both the copy and move modes specify that the callee is responsible for disposing of the argument. It makes good sense for the callee to declare that it will free the value provided as an argument. However, I do not understand why it is any of the callee’s business whether the caller chose to provide the value by copying or moving it. This seems like a decision the caller is better suited to make.
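A rough sketch of what caller-side choice could look like, written in later Rust syntax rather than the dialect of this post. The names consume and caller are hypothetical; the only thing the callee declares is that it owns its argument.

```rust
// Hypothetical callee: it owns `v` and disposes of it when it returns.
// It says nothing about how the caller produced the value.
fn consume(v: Vec<i32>) -> usize {
    v.len()
} // `v` is freed here

fn caller() -> (usize, usize, usize) {
    let data = vec![1, 2, 3];
    // The caller chooses to copy: `data` stays usable afterwards.
    let from_copy = consume(data.clone());
    let still_here = data.len();
    // The caller chooses to move: `data` cannot be used after this call.
    let from_move = consume(data);
    (from_copy, still_here, from_move)
}
```

The same callee serves both calls; whether to copy or move is decided entirely at the call site.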

Modes do not compose

Finally, having extra information about how a parameter is passed that is not part of its type makes it impossible to write generic functions that operate over functions taking arbitrary arguments. Consider a generic function timer:

fn timer<A>(f: fn(A), arg: A) {
    let t_start = get_current_time();
    f(arg);
    let t_stop = get_current_time();
    log (t_stop - t_start);
}

Seems simple enough. Now, you might ask, what if I wanted to use timer() with a function that takes two arguments, like foo():

type T = {...}; // some record type
fn foo(&t1: T, &t2: T) { ... }

This won’t work, because timer() expects a function of only one argument. But wait, with generic types we could write a little wrapper:

fn wrap2<A,B>(f: fn(A,B)) -> fn((A,B)) {
    ret lambda (pair: (A,B)) {
        let (a, b) = pair;
        f(a, b);
    };
}

And now we can replace a call like foo(v1, v2) with timer(wrap2(foo), (v1, v2)), right? Well, that’s true, but the behavior is slightly different. In the original, foo was passed pointers to v1 and v2, whereas now it is being passed pointers to the a and b temporaries. Not only that, but copies of v1 and v2 are occurring!

If we had used something like regions, then foo() would be defined:

fn foo(t1: &T, t2: &T) { ... }

and we could replace the call foo(&v1, &v2) with timer(wrap2(foo), (&v1, &v2)) with no change in the semantics.
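The pointer-typed version can be sketched in later Rust syntax, which differs from the dialect used above. Several details here are assumptions made for illustration: the hits counter inside T exists only so the effect of foo is observable, and timer returns the elapsed time instead of logging it.

```rust
use std::cell::Cell;
use std::time::{Duration, Instant};

// Hypothetical record type; `hits` only makes foo's effect visible.
struct T {
    hits: Cell<u32>,
}

// The pointers are part of the parameter types, not a separate mode.
fn foo(t1: &T, t2: &T) {
    t1.hits.set(t1.hits.get() + 1);
    t2.hits.set(t2.hits.get() + 1);
}

// Generic over any one-argument function; returns the elapsed time
// (an adaptation: the timer in the post logs it instead).
fn timer<A>(f: impl Fn(A), arg: A) -> Duration {
    let start = Instant::now();
    f(arg);
    start.elapsed()
}

// Adapts a two-argument function to take a single pair.
fn wrap2<A, B>(f: impl Fn(A, B)) -> impl Fn((A, B)) {
    move |(a, b)| f(a, b)
}

fn demo() -> (u32, u32) {
    let v1 = T { hits: Cell::new(0) };
    let v2 = T { hits: Cell::new(0) };
    // A and B are instantiated to pointer types, so foo still receives
    // pointers to v1 and v2 themselves; no copies of T are made.
    let _elapsed = timer(wrap2(foo), (&v1, &v2));
    (v1.hits.get(), v2.hits.get())
}
```

Because A and B are bound to pointer types, the wrapper forwards the original pointers, and the counters on v1 and v2 themselves are the ones that get incremented.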

So what could we do instead?

I’d rather see modes move into types. For example, by-reference (both mutable and immutable) can become pointer types, as in C. By value is basically unnecessary. To handle move and copy mode, you say that types like ~T or resources are always owned by the callee, then let the caller decide whether to move its value or copy it.