# Polonius and the case of the hereditary harrop predicate

21 January 2019

In my previous post about Polonius and subregion obligations, I
mentioned that there needs to be a follow-up to deal with
higher-ranked subregions. This post digs a bit more into what the
*problem* is in the first place and sketches out the general solution
I have in mind, but doesn’t give any concrete algorithms for it.

### The subset relation in Polonius is not enough

In my original post on Polonius, I assumed that when we computed a
subtype relation `T1 <: T2`

between two types, the result was either a
hard error or a set of `subset`

relations between various regions.
So, for example, if we had a subtype relation between two references:

```
&'a u32 <: &'b u32
```

the result would be a `subset`

relation `'a: 'b`

(or, “`'a`

contains a
subset of the loans in `'b`

”).

For a more complex case, consider the relationship of two fn types:

```
fn(&'a u32) <: fn(&'b u32)
// ^^^^^^^ ^^^^^^^^^^
// | A fn expecting a `&'b u32` as argument.
// |
// A fn expecting a `&'a u32` as argument.
```

If we imagine that we have some variable `f`

of type `fn(&'a u32)`

–
that is, a fn that can be called with a `'a`

reference – then this
subtype relation is saying that `f`

can be given the type `fn(&'b u32)`

– that is, a fn that can be called with a `'b`

reference. That
is fine so long as that `'b`

reference can be used as a `'a`

reference: that is, `&'b u32 <: &'a u32`

. So, we can say that the two
fn types are subtypes so long as `'b: 'a`

(note that the order is
reversed from the first example; this is because fn types are
*contravariant* in their argument type).

Unfortunately, this structure isn’t flexible enough to accommodate a subtyping question involving higher-ranked types. Consider a subtype relation like this:

```
fn(&'a u32) <: for<'b> fn(&'b u32)
// ^^^^^^^
// Unlike before, the supertype
// expects a reference with *any*
// lifetime as argument.
```

What subtype relation should come from this? We can’t say `'b: 'a`

as
before, because the lifetime `'b`

isn’t some specific region –
rather, the supertype says that the function has to accept a reference
with *any* lifetime `'b`

. In fact, this subtyping relation should
ultimately yield an error.

### Richer constraints

To express the constraints that arise from higher-ranked subtyping (and trait matching), we need a richer set of constraints than just subset. In fact, if you tease it all out, we need something more like this:

```
Constraint = Subset
| Constraint, Constraint // and
| forall<R1> { Constraint }
| exists<R1> { Constraint }
Subset = R1: R2
```

Now we can say that

```
fn(&'a u32) <: for<'b> fn(&'b u32)
```

holds if the constraint `forall<'b> { 'b: 'a }`

holds, which implies
that `'a`

has to contain all possible loans. This isn’t possible, and
so we would treat this as an error.

Interestingly, if we reverse the order of the two types:

```
for<'b> fn(&'b u32) <: fn(&'a u32)
```

we get the constraint `exists<'b> { 'a: 'b }`

(`for`

binders on the
*subtype* side are instantiated with “there exists”, not “for
all”). That is, the region `'a`

must be a subset of some possible set
of loans `'b`

. This constraint is trivially solveable: `'b`

could
always be exactly `'a`

itself.

### The role of free vs bound regions

As one final example, consider what happens here, where we added a
return type and another region (`'c`

):

```
for<'b> fn(&'b u32) -> &'b u32
<:
fn(&'a u32) -> &'c u32
```

This gives rise to the following constraint:

```
exists<'b> {
'a: 'b, // from relating the parameter types
'b: 'c, // from relating the return types
}
```

Here, the constraint is solveable, but only if `'a: 'c`

. Therefore, if
we think back to Polonius with its simple “subset” relations, we can
effective *reduce* this “rich” constraint to the subset relation `'a: 'c`

.

To do this reduction, we draw a distinction between the *bound* and
the *free* regions. *Bound* regions are those that are bound within a
`forall`

and `exists`

quantifiers (e.g., `'b`

), and *free* regions
those that are not. When we are reducing, we only care about two things:

**Do we have something**This often happens when a bound “forall” region is on the right-hand side.*unsatisfiable*about the constraint?- We saw this with
`forall<'b> { 'b: 'a }`

. - Another example is
`forall<'x, 'y> { 'x: 'y }`

. - Reducing something unsatisfiable is obviously an error.

- We saw this with
**What are the effects on the free regions?**Othertimes, bound regions effectively as a “go-between”, creating subset relations between the free regions.- We saw this with
`exists<'b> { 'a: 'b, 'b: 'c }`

. - Another, stranger example might be
`forall<'b> { 'a: 'b }`

: here, this is satisfiable, but only if`'a: 'static`

. This is true because`'static: 'b`

is implicitly true (`'static: R`

is true for any region R).- In Polonius terms,
`'static`

represents an “empty set” of loans, so this effectively means that`'a`

can be a subset of any region`'b`

by being the empty set.

- In Polonius terms,

- We saw this with

### Wait, those “richer” constraints look familiar…

The “richer” constraints I mentioned in the previous section basically
arise from taking a base predicate (`R1: R2`

) and “adding in” richer
constraint forms like “for all” and “there exists”. This may sound
familiar – if you recall my very first Chalk post, I talked
about the need to go beyond Prolog’s core “Horn clauses” and to
support “Hereditary Harrop” (HH) predicates. The basic idea was
to extend simple Horn clauses with “for all” and “there exists”, along
with a few other things.

In fact, “hereditary harrop predicates” are a kind of generic structure that we can apply to any base set of predicates. So, if we wanted, we might say that the region constraints we are creating can be extended to the full hereditary harrop form, which would look like so:

```
Constraint = Subset
| Constraint, Constraint // and
| Constraint; Constraint // or
| forall<R1> { Constraint }
| exists<R1> { Constraint }
| if (Assumption) { Constraint }
Assumption = Subset
| forall<R1> { Assumption }
| if (Constraint) { Assumption }
Subset = R1: R2
```

Here we support not only “for all” and “there exists” but also “implication” and even “or”. rustc doesn’t use constraints this rich today, but for various reasons I think we will want to eventually.

Why is it useful to talk about HH predicates? Well, HH predicates have the nice property that we can use basic Prolog-style search to find and enumerate all possible solutions to them. Besides, “hereditary harrop” is really fun to say.

### Conclusion

So now we have this problem. To encode the “solutions” to
higher-ranked subtyping and trait matching, we need to use this richer
notion of constraints that include `forall`

and `exists`

quantifiers. Once we add those, we are basically talking about
“hereditary harrop region constraints”. We’ve also talked about the
idea of mapping these complex constraints down to the simple subset
relation that Polonius uses, but here I only gave examples and didn’t
really give any sort of *algorithm*. I’ve done some experiments here,
and I may try to write them up in a future post, but I’m also curious
to know if somebody else has already solved this problem. I definitely
have that “reinventing the wheel” feeling here.

One really *nice* aspect of this general direction, though, is that it
means that Polonius effectively doesn’t care about these “richer”
constraints. The idea is that our subtyping and trait matching
algorithms can produce hereditary harrop region constraints (or some
subset thereof). These can be reduced to simpler subset constraints,
which are then passed to Polonius to do the final reasoning. (And, of
course, any of these steps may also produce an error.)

Comments, as usual, are requested in the internals thread for this blog post series.