More on fns

3 June 2013

I have been thinking about my previous proposal for fn types. I wanted to offer some refinements and further thoughts.

On Thunks

I proposed a trait Task for encapsulating a function and the parameters it needs to run. I don’t like this name because this concept could be used in other places beyond just tasks. I was thinking that the proper name is probably Thunk. I quote Wikipedia for the definition of Thunk: “In computer science, a thunk (also suspension, suspended computation or delayed computation) is a parameterless closure created to prevent the evaluation of an expression until forced at a later time.” (There are, admittedly, other contrary uses for the term)

Sugar for thunks

One of the most common criticisms that I heard of the proposal was that people did not like the idea of having to name the variables that are copying into a thunk. I can understand this, though I personally value the clarity of saying “spawn a task, taking these values out of the containing environment and into the task”. That said, if this is the stumbling block, we could integrate thunks a bit deeper into the language’s surface syntax, if not the type system, to avoid this issue.

Basically, you could imagine a keyword thunk { ... } that does precisely what my thunk!(...) macro did, except that it automatically determines the captured variables, rather than requiring they be listed.

Similarly, the do syntax could be extended to so that it can be used with a function that expects either a fn or a [sigil] Thunk object. If the function expects a fn, then do func { ... } would be equivalent to func(|| ...). If the function expects a [sigil] Thunk, then do func { ... } would be equivalent to func([sigil]thunk { ... }). This would be that we could continue to write the “pleasingly ambiguous” do spawn { ... } syntax, which is popular despite hiding allocations and makes the meaning of code dependent on the type of the callee (/me pouts).

Once functions

I haven’t made up my mind whether it makes sense to include once fn or not. On the one hand, as I showed, you can workaround it in a mechanical fashion, and I think it’s wise to avoid “mission creep” in the type system. On the other, the workaround is somewhat clumsy, and probably not something you want to do all that often.

To try and get a better feeling for how often once fns would be needed, I did a survey of all closures that appear in the core library. I found that the vast majority were executed multiple times. There were however a couple of common patterns where once fns appeared.

The pattern I had most in mind with once fns is setup/teardown functions, like task::unkillable, which use a closure to indicate what should be done between the setup and teardown. At least in the standard library, these functions are not that common, but they do occur.

Another common example is the with pattern, where you have a callback that is invoked with a reference to the data inside your object. This is relatively unusual now that we have lifetimes, it only occurs in a few specialized cases like locks, where extra action is needed before and after the with.

There was one example where once fn would be useful that stood out to me, however. Ironically, it is something that I myself proposed in a recent e-mail to rust-dev, where I suggested that we should encode “default” arguments using closures. So, for example, a function like Option’s get_or_default, which is currently written:

fn get_or_default<T>(opt: Option<T>, default: T) -> T

would take a closure instead:

fn get_or_default<T>(opt: Option<T>, default: fn() -> T) -> T

However, without once fns, this is in fact somewhat less flexible. To see what I mean, imagine trying to implement the old behavior (taking a value) on top of the proposed behavior (taking a closure):

fn get_or_default_value<T>(opt: Option<T>, default: T) -> T {
    get_or_default(opt, || default)
}

If you try to compile this, the compiler will signal an error because the || default closure is moving the argument default, and this is not legal within a closure because the closure may execute multiple times. Therefore, to write this wrapper, one would have to either copy the default value or use a cell, which is very unsatisfying. I only realized this problem would occur when kballard encountered it when refactoring the methods on hashmap.

Of course, it would be possible to write get_or_default using the pattern I proposed, in which we pass an extra argument to carry any moved values:

fn get_or_default<T,A>(opt: Option<T>, arg: A, default: fn(A) -> T) -> T

In that case, one could write the wrapper easily enough, although it is a bit repetitive:

fn get_or_default_value<T>(opt: Option<T>, default: T) -> T {
    get_or_default(opt, default, |default| default)
}

This would however mean that the common case, where nothing is moved, gets more verbose:

get_or_default(opt, (), |()| do_something())

So I find myself unsure about whether to include once fns or not. I don’t mind having to write this “pass an extra argument” pattern once in a while, but I do expect the default pattern to come up somewhat regularly in the standard library (certainly it occurs with options and maps, not sure where else). Ideally I’d prefer to defer once fns for “post Rust 1.0”, but it might be a bit unfortunate if we wound up with standard interfaces that include extra argument parameters that later became unnecessary. This is particularly true since once fn aren’t complicated to implement.

Expressing once fns with thunks

You might think that you could replace once fns with thunks, since they too are run-once. This is particularly appealing if we opted for thunk sugar. However, the main problem here is the question of a copying vs by reference closure. The way I described thunks, they were always copying the values they close over. So if you wrote:

get_or_default(opt, &thunk { do_something(a, b) })

This would copy/move a and b from the enclosing stack frame into the thunk. Of course, you can make a thunk that takes a reference into the enclosing stack frame, just by capturing a reference:

let r_a = &a;
let r_b = &b;
get_or_default(opt, &thunk { do_something(r_a, r_b) })

But this is rather inconvenient. So we’d presumably want to extend our thunk sugar to do this automatically. But then we’d need to distinguish between by reference thunks and copying thunks. We’d probably use the & sigil here, which would in turn mean that if you use the do syntax it’s not clear whether references are copying or by reference. This is all…tolerable, but it winds up feeling a lot more complicated to me than saying “thunks copy/move the values they close over, fns takes them by reference, a once fn can only be called once.”

Type names bikeshed

I mentioned in passing that I am not a fan of the extern fn name. It was suggested that perhaps raw function pointers should be fn and closures should be something else, such as closure(S) -> T, proc(S) -> T, or even |S| -> T. This is pure bikeshedding, but it has some appeal, since the type of a fn item would be fn(S) -> T, the type of an extern fn would be extern "C" fn(S) -> T, and the type of a closure would be, well, whatever it is. The main downside is that closure types are what you want most of the type, and fn(S) -> T is so nice and concise. Also, we’d need to find a place to put the closure bounds.