31 January 2018
In my last Chalk post, I talked about an experimental, SLG-based
solver that I wrote for Chalk. That particular design was based very
closely on the excellent paper
“Efficient top-down computation of queries under the well-founded semantics”, by W. Chen, T. Swift, and D. Warren. It
followed a traditional Prolog execution model: this has a lot of
strengths, but it probably wasn’t really suitable for use in rustc.
The single biggest reason for this was that it didn’t really know when
to stop: given a query like exists<T> { T: Sized }, it would happily
try to enumerate all sized types in the system. It was also pretty
non-obvious to me how to extend that system with things like
co-inductive predicates (needed for auto traits) and a few other
peculiarities of Rust.
read more →
21 October 2017
read more →
25 May 2017
For my next post discussing chalk, I want to take kind of a
different turn. I want to talk about the general struct of chalk
queries and how chalk handles them right now. (If you’ve never heard
of chalk, it’s sort of “reference implementation” for Rust’s trait
system, as well as an attempt to describe Rust’s trait system in terms
of its logical underpinnings; see
this post for an introduction to the big idea.)
read more →
23 April 2017
In my previous post, I talked over the basics of how
unification works and showed how that “mathematical version” winds
up being expressed in chalk. I want to go a bit further now and extend
that base system to cover associated types. These turn out to be a
pretty non-trival extension.
What is an associated type?
If you’re not a Rust programmer, you may not be familiar with the term
“associated type” (although many langages have equivalents). The basic
idea is that traits can have type members associated with them. I
find the most intuitive example to be the Iterator trait, which has
an associated type Item. This type corresponds to kind of elements
that are produced by the iterator:
read more →
25 March 2017
So in my first post on chalk, I mentioned that unification and
normalization of associated types were interesting topics. I’m going
to write a two-part blog post series covering that. This first part
begins with an overview of how ordinary type unification works during
compilation. The next post will add in associated types and we can see
what kinds of mischief they bring with them.
What is unification?
Let’s start with a brief overview of what unification is. When you are
doing type-checking or trait-checking, it often happens that you wind
up with types that you don’t know yet. For example, the user might
write None – you know that this has type Option<T>, but you don’t
know what that type T is. To handle this, the compiler will create a
type variable. This basically represents an unknown,
to-be-determined type. To denote this, I’ll write Option<?T>, where
the leading question mark indicates a variable.
read more →
26 January 2017
Over the last year or two (man, it’s scary how time flies), I’ve been
doing quite a lot of thinking about Rust’s trait system. I’ve been
looking for a way to correct a number of flaws and shortcomings in the
current implementation, not the least of which is that it’s
performance is not that great. But also, I’ve been wanting to get a
relatively clear, normative definition of how the trait system works,
so that we can better judge possible extensions. After a number of
false starts, I think I’m finally getting somewhere, so I wanted to
start writing about it.
read more →
24 August 2012
One of the things that is sometimes frustrating in Rust is the
inability to define a type that indicates some subset of enum
variants. For example, it is very common to have a pattern like this:
match an_expr {
expr_call(*) => process_call(an_expr)
...
}
fn process_call(a_call_expr: @ast::expr) { ... }
But as you can see, the type of a_call_expr does not reflect the
fact that this expression is a call. This is frustrating.
read more →
30 July 2012
I’ve been slowly learning how type inference works in
SpiderMonkey. As I understand it, SpiderMonkey’s type inference
scheme is the brain child of one Brian “Hack-it”, coder
extraordinaire. You may have seen a recent PLDI publication
on the topic. You may, like me, have read that publication. You may,
also like me, have walked away thinking, “um, I don’t really
understand how that works.” In that case, dear reader, this blog post
is for you. Well, actually, it’s for me, to try and document what I
gleaned from a conversation or two. It it is almost certainly not
entirely accurate and it may or may not be helpful.
read more →
13 July 2012
I had a very interesting discussion with Sriram and Terrence (of
Kilim and ANTLR fame, respectively—two smart
dudes) yesterday. One of the things we talked about was adapting
shared-memory data structures like concurrent hash maps into
an actor setting.
One thing we’ve found when working on Servo is that the temptation to
cheat is enormous. Most of the papers you read about things like
parallel layout just assume a shared memory setting and blithely make
use of data strutures like concurrent hash maps. There is nothing
wrong with such data structures, but if we can avoid shared, mutable
memory it will go a long way towards avoiding bugs I think—as well
as keeping things secure. Even if the bug is mostly correct, data
races and similar subtle errors can open holes for exploitation.
read more →
11 June 2012
Last Thursday and Friday I had the good fortune of presenting a paper
of mine at HotPar 2012. The paper is called
Parallel Closures: A New Twist on an Old Idea; it basically
describes what has evolved to become the PJs (Parallel JavaScript)
model, though it does so in the context of a static checker built in
Java.
I really enjoyed the workshop: the presenters were generally very good
and the audience was lively. I also appreciate that they make an
effort to encourage good conversation; for example, at lunchtime the
tables are labeled with topics for discussion (“memory models”, say,
or, “schedulers”). It’s always hard for young folk like myself to get
connected with the older, more knowledgable people in the audience, so
everything helps. (“Hi Hans Boehm, love your work on C++ memory
models”)
read more →
28 May 2012
I am dissatisfied with how mutability is treated in the Rust type
system. The current system is that a type is not prefixed mutable;
rather, lvalues are. That is, a type T is defined like so:
T = [M T]
| @ M T
| & M T
| { (M f : T)* }
| int
| uint
| ...
M = mut | const | (imm)
Note that there is no type mut int (a mutable integer). This is
logical enough; such a type has little inherent meaning: an integer is
a value, it is not mutable or immutable.
read more →
15 April 2012
For a long time, it was considered fairly obvious, I think, that
syntax didn’t really matter. It was just the surface skin over the
underlying ideas. In recent times, though, the prevailing wisdom has
reversed, and it is now quite common to hear people talk about how
“syntax matters”.
While I don’t exactly disagree, I think that the importance of trivial
syntactic matters is generally overemphasized. It is not a matter of
life and death whether or not semicolons are required to end a line,
for example, or whether parentheses are required in making a call.
read more →
28 March 2012
pcwalton and I (but mostly pcwalton) have been hard at work
implementing regions in Rust. We are hoping to use regions to avoid a
lot of memory allocation overhead in the compiler—the idea is to use
memory pools (a.k.a. arenas) so that we can cheaply allocate the data
needed to process a given function and then release it all in one
shot. It is well known that arenas are great fit for the memory
allocation patterns of a compiler, which tend to produce a lot of data
that lives for the duration of a pass but is not needed afterwards.
read more →
16 February 2012
One commonly requested feature for regions is the ability to return
references to the inside of structures. I did not allow that in the
proposal in my previous post because I did not want to have any
region annotations beyond a simple &. I think, however, that if you
want to allow returning references to the interior of a parameter, you
need a way for the user to denote region names explicitly.
read more →
15 February 2012
I was talking to brson today about the possibility of moving Rust to a
regions system. He pointed out that the complexity costs may be high.
I was trying to make a slimmer version where explicit region names
were never required. This is what I came up with. The truth is, it’s
not that different from the original: adding back region names wouldn’t
change much. But I’m posting it anyway because it includes a description
of how to handle regions in types and I think it’s the most complete and
correct proposal at the moment.
read more →
14 February 2012
Brian pointed out to me a nice solution to the Task API problem that I
have overlooked, though it’s fairly obvious. Basically, I had
rejected a “builder” style API for tasks because there is often a need
for the child task to be able to send some data back to its parent
after it has been spawned, and a builder API cannot easily accommodate
this. Brian’s idea was to encapsulate these using futures. It’s
still not perfect but it’s better I think and more composable than my
first, limited proposal. It still requires that the actor pattern be
a separate module.
read more →
1 February 2012
It’s been a while since I wrote anything on the blog! A lot has been
going on in the meantime, both in Rust, parallel JavaScript, and
personally…I hate to write a big update post but I gotta’ catch up
somehow!
Rust
First, we made our 0.1 release, which is great. We are now planning
for 0.2. The goal is to make frequent, relatively regular releases.
We’re still in such an early phase that it doesn’t seem to make sense
to literally release every few months, but at the same time we don’t
plan to wait long.
read more →
11 January 2012
In one of the comments on yesterday’s post,
Tushar Pokle asked why I would champion my model over an
Erlang model of strict data separation. There are several answers to
this question. The simplest answer is that Web Workers already
provide an actors model, though they do not make tasks particularly
cheap (it’s possible to work around this by creating a fixed number of
workers and sending tasks for them to execute).
read more →
9 January 2012
Lately the ideas for a parallel, shared memory JavaScript have begun
to take shape. I’ve been discussing with various
JavaScript luminaries and it seems like a
design is starting to emerge. This post serves as a documentation of
the basic ideas; I’m sure the details will change as we go along.
User Model
The model is that a JavaScript worker (the “parent”) may spawn a
number of child tasks (the “children”). The parent is suspended while
the children execute, meaning that it will not process events or take
other actions. Once the children have completed the parent will be
re-awoken.
read more →
29 December 2011
The original Rust design included iterators very similar to Python’s
generators. As I understand it, these were stripped out in favor of
Ruby-esque blocks, partially because nobody could agree on the best
way to implement iterators. I like blocks, but it seems like it’s
more natural to compose iterators, so I wanted to think a bit about
how one might use blocks to achieve similar things. I’m sure this is
nothing new; there must be hundreds of libraries in Haskell that do
the same things I’m talking about here.
read more →
20 December 2011
In the context of thinking about parallelism for Rust, I have been reminded
of an older idea I had for a lightweight, predictable dynamic race
detection monitoring system based around block-scoped parallelism. I should
think this would be suitable for (an extended version of) a dynamic
language like Python, JavaScript, or Lua. I will write in a Python-like
syntax since I know it best, but I am debating about exploring this
for JavaScript.
read more →
9 December 2011
I’ve been thinking a lot about “parallel blocks” recently and I am
beginning to think they can be made to work very simply. The main
thing that is needed is a type qualifier const that means
“read-only”. This would be a type prefix with very low precedence,
just like immutable and shared in D. The type const T
would refer to an instance of T that cannot be modified. This is a
deep property, so, given some record types defined like:
read more →
8 December 2011
Marijn asked me what it is that I dislike about parameter
modes. I thought I might as well explain here.
For background, today in Rust a function can declare each parameter in
one of several modes:
- By value (
++): No pointer is used but the value is not owned by the
callee. Therefore, the callee does not need to free it, for example, or
decrement a ref count. - By immutable reference (
&&): a pointer to the variable in the caller’s
stack frame is passed, but the callee cannot use it to make changes.
Can be passed an lvalue or an rvalue. - By mutable reference (
&): a pointer to the variable in the caller’s
stack frame is passed, and the callee can use it to reassign the variable.
Can only be passed an lvalue. - By copy (
+): A fresh copy of the value is created and the callee must
dispose of it. - By move (
-): The value is moved from the caller’s stack frame and the
callee must dispose of it.
So what don’t I like about modes?
read more →
2 December 2011
One of the better features from functional programming languages are
variant types (a.k.a. algebraic data types). Basically they are a way
of enumerating a small set of possibilities and then making sure that
you handle every possible case. However, in real world use variant
types tend to run into a few annoying problems. While working on the
Harmonic compiler, I found that
Scala’s case classes addressed some of these shortcomings.
My goal in writing Scala code was to never have an assert false to
cover situations I knew could not occur. I did not quite succeed, but
I got really close, much closer than I ever got in any other language.
Mostly where I failed I knew that I could refactor the types but I did
not want to spend the time to do it. In this post I want to explain
how and why the case class approach seems to work better than
traditional variant types. In later posts I’ll cover some of the
other tricks that I ended up using, particularly the approach I used
to having an AST whose shape changed over time.
read more →