Polonius revisited, part 2
29 September 2023
In the previous Polonius post, we formulated the original borrow checker in a Polonius-like style. In this post, we are going to explore how we can extend that formulation to be flow-sensitive. In so doing, we will enable the original Polonius goals, but also overcome some of its shortcomings. I believe this formulation is also more amenable to efficient implementation. As I’ll cover at the end, though, I do find myself wondering if there’s still more room for improvement.
Running example
We will be working from the same Rust example as the original post, but focusing especially on the mutation in the false
branch1:
let mut x = 22;
let mut y = 44;
let mut p: &'0 u32 = &x;
y += 1;
let mut q: &'1 u32 = &y; // Borrow `y` here (L1)
if something() {
p = q; // Store borrow into `p`
x += 1;
} else {
y += 1; // Mutate `y` on `false` branch
}
y += 1;
read_value(p); // May refer to `x` or `y`
There is no reason to have an error on this line. There is a borrow of y
, but on the false
branch that borrow is only stored in q
, and q
will never be read again. So there cannot be undefined behavior (UB).
Existing borrow checker flags an error
The existing borrow checker, however, is not that smart. It sees read_value(p)
at the end and, because that line could potentially read x
or y
, it flags the y += 1
as an error. When expressed this way, maybe you can have some sympathy for the poor borrow checker – it’s not an unreasonable conclusion! But it’s wrong.
The core issue of the existing borrow check stems from its use of a flow insensitive subset graph. This in turn is related to how it does the type check. In Polonius today, each variable has a single type and hence a single origin (e.g., q: &'1 u32
). This causes us to conflate all the possible loans that the variable may refer to throughout execution. And yet as we have seen, this information is actually flow dependent.
The borrow checker today is based on a pretty standard style of type checker applied to the MIR. Essentially there is an environment that maps each variable to a type.
Env = { X -> Type }
Type = scalar | & 'Y T | ...
Then we have type-checking inference rules that thread this same environment everywhere. Conceptually the structure of the the rules is as follows:
construct Env from local variable declarations
Env |- each basic block type checks
--------------------------
the MIR type checks
Type-checking a place then uses this Env
, bottoming out in an inference rule like:
Env[X] = T
-------------
Env |- X : T
Flow-sensitive type check
The key thing that makes the borrow checker flow insensitive is that we use the same environment at all points. What if instead we had one environment per program point:
EnvAt = { Point -> Env }
Whenever we type check a statement at program point A
, we will use EnvAt[A]
as its environment. When program point A
flows into point B
, then the environment at A
must be a subenvironment of the environment at B
, which we write as EnvAt[A] <: EnvAt[B]
.
The subenvironment relationship Env1 <: Env2
holds if
- for each variable
X
inEnv2
:X
appears inEnv1
Env1[X] <: Env2[X]
There are two interesting things here. The first is that the set of variables can change over time. The idea is that once a variable goes dead, you can drop it from the environment. The second is that the type of the variable can change according to the subtyping rules.
You can think of flow-sensitive typing as if, for each program variable like q
, we have a separate copy per program point, so q@A
for point A
and q@B
for point at B
. When we flow from one point to another, we assign from q@A
to q@B
. Like any assignment, this would require the type of q@A
to be a subtype of the type of q@B
.
Flow-sensitive typing in our example
Let’s see how this idea of a flow-sensitive type check plays out for our example. First, recall the MIR for our example from the previous post:
flowchart TD Intro --> BB1 Intro["let mut x: i32\nlet mut y: i32\nlet mut p: &'0 i32\nlet mut q: &'1 i32"] BB1["BB1:\np = &x;\ny = y + 1;\nq = &y;\nif something goto BB2 else BB3"] BB1 --> BB2 BB1 --> BB3 BB2["BB2\np = q;\nx = x + 1;\n"] BB3["BB3\ny = y + 1;"] BB2 --> BB4; BB3 --> BB4; BB4["BB4\ny = y + 1;\nread_value(p);\n"] classDef default text-align:left,fill-opacity:0;
One environment per program point
In the original, flow-insensitive type check, the first thing we did was to create origin variables ('0
, '1
) for each of the origins that appear in our types. You can see those variables in the chart above. So we effectively had an environment like
Env_flow_insensitive = {
p: &'0 i32,
q: &'1 i32,
}
But now we are going to have one environment per program point. There is one program point in between each MIR statement. So the point BB1_0
would be the entry to basic block BB1
, and BB1_1
would be after the first statement. So we have Env_BB1_0
, Env_BB1_1
, etc. We are going to create distinct origin variables for each of them:
Env_BB1_0 = {
p: &'0_BB1_0 i32,
q: &'1_BB1_0 i32,
}
Env_BB1_1 = {
p: &'0_BB1_1 i32,
q: &'1_BB1_1 i32,
}
...
Type-checking the edge from BB1 to BB2
Let’s look at point BB1_3
, which is the final line in BB1, which in MIR-speak is called the terminator. It is an if terminator (if something goto BB2 else BB3
). To type-check it, we will take the environment on entry (Env_BB1_3
) and require that it is a sub-environment of the environment on entry to the true branch (Env_BB2_0
) and on entry to the false branch (Env1_BB3_0
).
Let’s start with the true branch. Here we have the environment Env_BB2_0
:
Env_BB2_0 = {
q: &'1_BB2_0 i32,
}
You should notice something curious here – why is there no entry for p
? The reason is that the variable p
is dead on entry to BB2, because its current value is about to be overridden. The type checker knows not to include dead variables in the environment.
This means that…
Env_BB1_3 <: Env_BB2_0
if the type ofq
atBB1_3
is a subtype of the type ofq
atBB2_0
…- …so
&'1_BB1_3 i32 <: &'1_BB2_0 i32
must hold… - …so
'1_BB1_3 : '1_BB2_0
must hold.
What we just found then is that, because of the edge from BB1 to BB2, the version of '1
on exit from BB1 flows into '1
on entry to BB2.
Type-checking the p = q
assignment
let’s look at the assignment p = q
. This occurs in statement BB2_0. The environment before we just saw:
Env_BB2_0 = {
q: &'1_BB2_0 i32,
}
For an assignment, we take the type of the left-hand side (p
) from the environment after, because that is what we are storing into. The environment after is Env_BB2_1
:
Env_BB2_1 = {
p: &'0_BB2_1 i32,
}
And so to type check the statement, we get that &'1_BB2_0 i32 <: &'0 BB2_1 i32
, or '1_BB2_0 : '0_BB2_1
.
In addition to this relation from the assignment, we also have to make the environment Env_BB2_0
be a subenvironment of the env after Env_BB2_1
. But since the set of live variables are disjoint, in this case, that doesn’t add anything to the picture.
Type-checking the edge from BB1 to BB3
As the final example, let’s look at the false edge from BB1 to BB3. On entry to BB3, the variable q
is dead but p
is not, so the environment looks like
Env_BB3_0 = {
p: &'0_BB3_0 i32,
}
Following a similar process to before, we conclude that '0_BB1_3 : '0_BB3_0
.
Building the flow-sensitive subset graph
We are now starting to see how we can build a flow-sensitive version of the flow graph. Instead of having one node in the graph per origin variable, we now have one node in the graph per origin variable per program point, and we create an edge N1 -> N2
between two nodes if the type check requires that N1 : N2
, just as before. Basically the only difference is that we have a lot more nodes.
Putting together what we saw thus far, we can construct a subset graph for this program like the following. I’ve excluded nodes that correspond to dead variables – so for example there is no node '1_BB1_0
, because '1
appears in the variable q
, and q
is dead at the start of the program.
flowchart TD subgraph "'0" N0_BB1_0["'0_BB1_0"] N0_BB1_1["'0_BB1_1"] N0_BB1_2["'0_BB1_2"] N0_BB1_3["'0_BB1_3"] N0_BB2_1["'0_BB2_1"] N0_BB3_0["'0_BB3_0"] N0_BB4_0["'0_BB4_0"] N0_BB4_1["'0_BB4_1"] end subgraph "'1" N1_BB1_2["'1_BB1_2"] N1_BB1_3["'1_BB1_3"] N1_BB2_0["'1_BB2_0"] end subgraph "Loans" L0["{L0} (&x)"] L1["{L1} (&y)"] end L0 --> N0_BB1_0 L1 --> N1_BB1_2 N0_BB1_0 --> N0_BB1_1 --> N0_BB1_2 --> N0_BB1_3 N0_BB1_3 --> N0_BB3_0 N0_BB3_0 --> N0_BB4_0 --> N0_BB4_1 N0_BB2_1 --> N0_BB4_0 N1_BB1_2 --> N1_BB1_3 N1_BB1_3 --> N1_BB2_0 N1_BB2_0 --> N0_BB2_1
Just as before, we can trace back from the node for a particular origin O to find all the loans contained within O. Only this time, the origin O also indicates a program point.
In particular, compare '0_BB3_0
(the data reachable from p
on the false
branch of the if) to '0_BB4_0
(the data reachable after the if finishes). We can see that in the first case, the origin can only reference L0
, but afterwards, it could reference L1
.
Active loans
Just as in described in the previous post, to complete the analysis we compute the active loans. Active loans are defined in almost exactly the same way, but with one twist. A loan L
is active at a program point P
if there is a path from the borrow that created L
to P
where, for each point along the path…
- there is some live variable whose type at
P
may reference the loan; and, - the place expression that was borrowed by
L
(here,x
) is not reassigned atP
.
See the bolded test? We are now taking into account the fact that the type of the variable can change along the path. In particular, it may reference distinct origins.
Implementing using dataflow
Just as in the previous post, we can compute active loans using dataflow. In particular, we gen a loan when it is issued, and we kill a loan L
at a point P
if (a) there are no live variables whose origins contain L
or (b) the path borrowed by L
is assigned at P
.
Applying this to our running example
When we apply this to our running example, the unnecessary error on the false
branch of the if
goes away. Let’s walk through it.
Entry block
In BB1
, we gen L0
and L1
at their two borrow sites, respectively. As a result, the active loans on exit from BB1
wil be {L0, L1}
:
flowchart TD Start["..."] BB1["BB1: p = &x; // Gen: L0 y = y + 1; q = &y; // Gen: L1 if something goto BB2 else BB3 "] BB2["..."] BB3["..."] BB4["..."] Start --> BB1 BB1 --> BB2 BB1 --> BB3 BB2 --> BB4 BB3 --> BB4 classDef default text-align:left,fill:#ffffff; classDef highlight text-align:left,fill:yellow; class BB3 highlight
The false
branch of the if
On the false
branch of the if
(BB3
), the only live reference is p
, which will be used later on in BB4
. In particular, q
is dead.
In the flow insensitive version, when the borrow checker looked at the type of p
, it was p: &'0 i32
, and '0
had the value {L0, L1}
, so the borrow checker concluded that both loans were active.
But in the flow sensitive version we are looking at now, the type of p
on entry to BB3
is p: &'0_BB3_0 i32
. And, consulting the subset graph shown earlier in this post, the value of '0_BB3_0
is just {L0}
. So there is a kill for L1
on entry to the block. This means that the only active loan is L0
, which borrows x
. This in turn means that y = y + 1
is not an error.
flowchart TD Start[" ... "] BB1[" BB1: p = &x; // Gen: L0 ... q = &y; // Gen: L1 ... "] BB2[" BB2: ... "] BB3[" BB3: // Kill `L1` (no live references) // Active loans: {L0} y = y + 1; "] BB4[" BB4: ... read_value(p); // later use of `p` "] Start --> BB1 BB1 --> BB2 BB1 --> BB3 BB2 --> BB4 BB3 --> BB4 classDef default text-align:left,fill:#ffffff; classDef highlight text-align:left,fill:yellow; class BB3 highlight
The role of invariance: vec-push-ref
I didn’t highlight it before, but invariance plays a really interesting role in this analysis. Let’s see another example, a simplified version of vec-push-ref
from polonius:
let v: Vec<&'v u32>;
let p: &'p mut Vec<&'vp u32>;
let x: u32;
/* P0 */ v = vec![];
/* P1 */ p = &mut v; // Loan L0
/* P2 */ x += 1; // <-- Expect NO error here.
/* P3 */ p.push(&x); // Loan 1
/* P4 */ x += 1; // <-- 💥 Expect an error here!
/* P5 */ drop(v);
What makes this interesting? We create a reference p
at point P1
that points at v
. We then insert a borrow of x
into the reference p
. After that point, the reference p
is dead, but the loan L1
is still active – this is because it is also stored in v
. This connection between p
and v
is what is key about this example.
The way that this connection is reflected in the type system is through variance. In particular, a type &mut T
is invariant with respect to T
. This means that when you assign one reference to another, the type that they reference must be exactly the same.
In terms of the subset graph, invariance works out to creating bidirectional edges between origins. Take a look at the resulting subset graph to see what I mean. To keep things simple, I am going to exclude nodes for p
: the interesting origins here at 'v
(the data in the vector v
) and 'vp
(the data in the vector referenced by p
– which is also v
).
flowchart TD subgraph "Loans" L1["L1 (&x)"] end subgraph "'v" V_P0["'v_P0"] V_P1["'v_P1"] V_P2["'v_P2"] V_P3["'v_P3"] V_P4["'v_P4"] V_P5["'v_P5"] end subgraph "'vp" VP_P1["'vp_P1"] VP_P2["'vp_P2"] VP_P3["'vp_P3"] end V_P0 --> V_P1 --> V_P2 --> V_P3 --> V_P4 --> V_P5 V_P1 <---> VP_P1 VP_P1 <---> VP_P2 <---> VP_P3 L1 --> VP_P3
The key part here are the bidirectional arrows between v_P1
and vp_P1
and between vp_P1
and vp_P3
. How did those come about?
- The first edge resulted from
p = &mut v
. The type ofv
(atP1
) isVec<&'v_P1 u32>
, and that type had to be equal to the referent ofp
(Vec<&'vp_P1 u32>
). Since the types must be equal, that means'v_P1: 'vp_P1
and vice versa, hence a bidirectional arrow. - The second edge resulted from the flow from
P1
toP3
. The variablep
is live across that edge, so its type before (&'p_P1 mut Vec<&'vp_P1 u32>
) must be a subtype of its type after (&'p_P3 mut Vec<&'vp_P3 u32>
). Because&mut
references are invariant with respect to their referent types, this implies that'vp_P1
and'vp_P3
must be equal.
Put all together, and we see that L1
can reach 'v_P4
and 'v_P5
, even though it only flowed into an earlier point in the graph. That’s cool! We will get the error we expect.
On the other hand, we can also see that there is some imprecision introduced through invariance. The loan L1
is introduced at point P3
, and yet it appears to flow from 'vp_P3
backwards in time to 'vp_P2
, 'vp_P1
, over to 'v_P1
, and downward from there. If we were only looking at the subset graph, then, we would conclude that both x += 1
statements in this program are illegal, but in fact only the second one causes a problem.
Active loans to the rescue (again)
The imprecision we see here is very similar to the imprecision we saw in the original polonius. Effectively, invariance is taking away some of our flow sensitivity. Interestingly, the active loans portion of the analysis makes up for this, in the same way that it did in the previous post. In vec-push-ref, L1
will only be generated at P3
, so even though it can reach 'v_P2
via the subset graph, it is not considered active at P2
. But once it is generated, it is not killed, even when p
goes dead, because it can flow into 'v_P4
. Therefore we get the one error we expect.
Conclusion
I’m going to stop this post here. I’ve described a version of polonius where we give variables distinct types at each program point and then relate those types together to create an improved subset graph. This graph increases the precision of the active loans analysis such that we don’t get as many false errors, but it is still imprecise in some ways.
I think this formulation is interesting for a few reasons. First, the most expensive part of it is going to be the subset graph, which has a LOT of nodes and edges. But that can be compressed significantly with some simple heuristics. Moreover, the core operation we perform on that graph is reachability, and that can be implemented quite efficiently as well (do a strongly connected components computation to reduce the graph to a tree, and then you can assign pre- and post-orderings and just compare indices). So I believe it could scale in practice.
I have worked through a few more classic examples, and I may come back to them in future posts, so far this analysis seems to get the results I expect. However, I would also like to go back and compare it more deeply to the original polonius, as well as to some of the formulations that came out of academia. There is still something odd about leaning on the dataflow check. I hope to talk about some of that in follow-up posts (or perhaps on Zulip or elsewhere with some of you readers!).
If this particular example feels artificial, that’s because it is. But similar errors cause more common errors, most notably Problem Case #3. ↩︎