The basic parallel model would be block-scoped parallelism as I’ve been
referring to in the various Rust posts. To make this concrete, here is a
low-level example of how it might work; this is a dynamic version of
something like the
finally/async blocks of X10. There are two basic concepts,
a parallel region and a set of parallel tasks. Each task is created within
some parallel region; when the parallel region is executed, all of
the tasks begin exeution. They may continue to create new tasks. Meanwhile,
the task which executed the parallel region blocks until it completes.
def divide_and_conquer(...): if "small enough to do sequentially": process(list) else: p = parallel_region() task1 = p.new_task(lambda: divide_and_conquer(...)) task2 = p.new_task(lambda: divide_and_conquer(...)) p.execute() task1.get_result() # yields an error unless p.execute() has run
I will leave aside for the moment the question of deciding which problems are small enough to do sequentially and so forth. This task and parallel region API is intended to be low-level. Higher-level libraries would build on it to make those sorts of decisions.
Anyway the API is somewhat orthogonal. We could dicker about the precise design, I am more interested in the data-race freedom part. The key properties that the API must maintain are:
- Tasks are always created within a parallel region.
- The parent task which executes the parallel region is always suspended while its child tasks execute.
Monitoring for data races
Now, the key idea: I want to say that each object is owned by the task which creates it. The data is then mutable only by its owner and any parent task of the owner. The data is readable by the owner and any child task of the owner. These constraints—assuming no global data—suffice to guarantee that a given object cannot leak to siblings of its owner. Only the owner, ancesors of the owner, and children of the owner can have access to the object.
This leaking property is important, so let’s take a second to see why
it’s true: the idea is that two sibling tasks can only communicate via
shared memory. This memory must have existed when the tasks were
created, so it must be owned by a common parent of the sibling tasks.
This common memory is therefore immutable: only the common parent could modify it, and the comment parent is suspended while its children execute.
Based on this, we can say that whenever we modify a field of an object, we must check that we are either the owner or a parent of the owner. If we are a parent, then the owner must have terminated (else the parent could be not be active), so we can adjust the owner of the object to be ourselves. Thus the check for “do I have write access?” is kind of a variant of Tarjan’s union set algorithm with chain compression.
Interestingly, no dynamic check at all is needed to do a read: either we own the object, in which case we can read it, or it is owned by a parent or extinct child. In all of those cases reads are permitted.
I’ve phrased this write check as a per-object check, but I think that in an actual implementation, I would really do it on groups of objects; this would be implemented almost exactly like a write barrier in a garbage collected environment, except that we have to not only write out the dirty bit but read it first to check that we have write access. Obviously this will be slower, but not that much slower I should think, particularly as we are likely to be writing to recently created objects, and hence have the dirty bit in cache.
What about other models?
I showed this for a simple fork-join model. I think you can rephrase it in terms of futures or something like Dot Net’s parallel tasks as well.