Unifying patterns in alts and lets
10 June 2012
This is a proposal to unify the mechanics of alt and destructuring
assignment. It was born out of discussion between erickt, pcwalton,
and me amidst various bugs in the bug tracker, but I wanted to float it
around to a larger audience. I’d like to discuss this on Tuesday,
because one of the logical next steps for the regions work is to begin
deciding precisely what to do about the types of identifiers in alts.
Today
Currently, alt always creates implicit references into the structure
you are alting over. For example, in this code:
    let p = {x:1, y:2};
    alt p {
        {x: x, y: y} {
            ...
        }
    }
the bound variables x and y are actually pointers to the interior
of p.
In addition, one can use let to match infallible patterns:
    let p = {x: 3, y: 4};
    let {x: x, y: y} = p;
Here, however, the values are actually not pointers to the interior of
p but rather are copied out of p.
Shortcomings
Sometimes it is useful to get a pointer to the interior of a pattern
in a let and sometimes it is useful to copy out in an alt, but
the current system does not let you choose. In addition, it is often
very useful to move out of the discriminant in an alt, but that is
not currently an option.
The matter of copying out of an alt is somewhat more important under
the new borrowck rules. This is because the older system would
implicitly copy out of the discriminant when it appeared that the
value being matched was residing in mutable memory or that it might be
invalidated in some way. This is no longer the case, which means that
more explicit copies are required in order to match against the
contents of an enum or unique pointer that lives in mutable memory (I
am actively working on a blog post / tutorial about the details of
this new check).
The proposal
The proposal is to distinguish between copying bindings and
reference bindings. A copying binding, indicated by a variable name
alone (x), copies or moves the value out of the discriminant. A
reference binding, indicated using *x (see some notes on syntax
below), takes the address of the value within the discriminant. For
types that are not implicitly copyable, copying bindings must be
preceded by a copy keyword (copy x).
Here is an example of creating references into the interior:
    let p = {x:1, y:2};
    alt p {
        {x: *x, y: *y} {
            ...
        }
    }
And the same example using let:
    let p = {x:1, y:2};
    let {x: *x, y: *y} = p;
Here is an example of copying the values out:
    let p = {x:1, y:2};
    alt p {
        {x: x, y: y} {
            ...
        }
    }
    let {x: x, y: y} = p;
And finally an example that requires an explicit copy keyword:
    let p = {x: ~1, y: ~2};
    alt p {
        {x: copy x, y: copy y} {
            ...
        }
    }
    let {x: copy x, y: copy y} = p;
Here, a pattern like {x: x, y: y} would result in a warning because
a unique value is being copied (which requires memory allocation and
is a performance red-flag).
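For readers who want something they can compile, the three kinds of binding can be approximated in present-day Rust. This is only a sketch under assumptions, not the proposed syntax: ref x stands in for the proposed *x, an explicit .clone() stands in for copy x, and the Point and BoxedPoint types and all function names are hypothetical.

```rust
// Hypothetical record types standing in for {x: 1, y: 2} and {x: ~1, y: ~2}.
struct Point {
    x: i32,
    y: i32,
}

struct BoxedPoint {
    x: Box<i32>,
    y: Box<i32>,
}

// Reference bindings: `ref x` borrows each field in place,
// playing the role of the proposed `*x`.
fn sum_by_reference(p: &Point) -> i32 {
    match *p {
        Point { ref x, ref y } => *x + *y,
    }
}

// Copying bindings: integers are implicitly copyable,
// so a plain `x` copies them out of the structure.
fn sum_by_copy(p: &Point) -> i32 {
    let Point { x, y } = *p;
    x + y
}

// Explicit copies: duplicating a Box allocates, so the copy is spelled out
// with `.clone()`, standing in for the proposed `copy x`.
fn copy_out(p: &BoxedPoint) -> (Box<i32>, Box<i32>) {
    match *p {
        BoxedPoint { ref x, ref y } => (x.clone(), y.clone()),
    }
}
```

In all three functions the caller keeps ownership of the original value. One difference from the proposal is that present-day Rust rejects an attempt to bind the boxed fields by plain name through a borrow, rather than warning about it.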
Moves
As a bonus, this idea transparently permits data to be moved as part
of an alt (hat tip to pcwalton for this observation). For example,
the function called option::unwrap() could be written as follows
(here I am assuming a unary move operator, something generally agreed
to but not yet implemented):
    fn unwrap<T>(-opt: option<T>) -> T {
        alt move opt {
            some(v) { ret v; }
            none { fail; }
        }
    }
Basically, if the discriminant is moved into the alt then its pieces
can be carved up and moved into the bindings. This is equivalent to
lets like the following (which is legal today):
    let (x, y) = move v;
For symmetry, the move keyword could be permitted on copying
bindings (move x). It seems, though, that this would always be
superfluous except in the case of last use, where it could serve as
useful documentation:
    fn unwrap<T>(-opt: option<T>) -> T {
        alt opt {
            some(move v) { ret v; }
            none { fail; }
        }
    }
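The move examples can also be approximated in present-day Rust. Again this is a sketch under assumptions: Opt is a hypothetical stand-in for option<T>, panic! replaces fail, swap_pair is an invented helper, and no move keyword appears because by-value bindings move by default in the modern language.

```rust
// Hypothetical stand-in for the post's option<T>.
enum Opt<T> {
    Some(T),
    None,
}

// The discriminant is taken by value, so the match may carve it up
// and move the payload into the binding `v`.
fn unwrap<T>(opt: Opt<T>) -> T {
    match opt {
        Opt::Some(v) => v,
        Opt::None => panic!("unwrap called on none"),
    }
}

// The let form: destructuring a tuple by value moves both halves out.
fn swap_pair(v: (String, String)) -> (String, String) {
    let (x, y) = v;
    (y, x)
}
```

Neither function copies the payload. After the call, the argument is no longer usable by the caller, which is the behavior that alt move opt is meant to express.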
Syntax
I borrowed the *identifier syntax from Cyclone. However, I would
personally prefer &identifier, as it is more reminiscent of the
“take the address of” operator. The difficulty is that I presume &P
will eventually become a pattern, like @P and ~P today (currently,
there is no pattern to match against an &T type). There was some
talk at various points of making unsafe pointers be a special kind of
lifetime, like static, so that one would write *unsafe T, in which
case *r.T could replace &r.T as the type of safe references. That
would in turn permit switching the role of & and * in patterns.